Preface



This volume contains the papers selected for presentation at the conference and 
two abstracts from invited speakers. The programme committee selected these 
25 papers from 12 countries out of 65 submissions from 17 countries. 

The first JELIA meeting was in Roscoff, France, ten years ago. Afterwards, 
it took place in the Netherlands, Germany, United Kingdom, Portugal, and now 
again in Germany. The proceedings of the last four meetings appeared in the 
Springer-Verlag LNCS series, and a selected series of papers of the English and
the Portuguese meetings appeared as special issues in the Journal of Applied
Non-Classical Logics and in the Journal of Automated Reasoning, respectively. 

The aim of JELIA was and still is to provide a forum for the exchange of ideas 
and results in the domain of foundations of AI, focusing on rigorous descriptions 
of some aspects of intelligence. These descriptions are promoted by applications, 
and produced by logical tools and methods. The papers contained in this volume 
cover the following topics: 

1. Logic programming 

2. Epistemic logics 

3. Theorem proving 

4. Non-monotonic reasoning 

5. Non-standard logics 

6. Knowledge representation 

7. Higher order logics 

We would like to warmly thank the authors, the invited speakers, the mem- 
bers of the program committee, and the additional reviewers listed below. They 
all have made these proceedings possible and ensured their quality. 



August 1998 Jürgen Dix, Koblenz

Luis Fariñas del Cerro, Toulouse
Ulrich Furbach, Koblenz
The Well-Founded Semantics Is the Principle of Inductive Definition



Marc Denecker 

Department of Computer Science, K.U. Leuven, Belgium 
marcd@cs.kuleuven.ac.be



Abstract. Existing formalisations of (transfinite) inductive definitions 
in constructive mathematics are reviewed and strong correspondences 
with LP under least model and perfect model semantics become appar- 
ent. I point to fundamental restrictions of these existing formalisations 
and argue that the well-founded semantics (wfs) overcomes these prob- 
lems and hence, provides a superior formalisation of the principle of 
inductive definition. The contribution of this study for LP is that it (re-) 
introduces the knowledge theoretic interpretation of LP as a logic for 
representing definitional knowledge. I point to fundamental differences 
between this knowledge theoretic interpretation of LP and the more com- 
monly known interpretations of LP as default theories or auto-epistemic 
theories. The relevance is that differences in knowledge theoretic inter- 
pretation have strong impact on knowledge representation methodology 
and on extensions of the LP formalism, for example for representing 
uncertainty. 



1 Introduction 

With the completion semantics [5], Clark aimed at formalising the meaning of 
a logic program as a set of definitions. To that aim, he maps a logic program 
to a set of First Order Logic (FOL) equivalences. Motivated by the research in 
Nonmonotonic Reasoning, logic programming is currently often seen as a default 
logic or auto-epistemic logic. In [11], Gelfond proposes a semantics for stratified 
logic programs based on an auto-epistemic interpretation of the formalism. In 
[12], Gelfond and Lifschitz motivate the stable semantics for logic programs from 
the perspective of logic programs as default and auto-epistemic theories. 

To compare these readings, consider the program $P_0$ with unique rule:

$$dead \leftarrow not\ alive$$

$P_0$ is propositional and hierarchical; all common semantics of LP (completion /
perfect [3,21] / stable [12] / wfs [28]) agree: for the above example, the unique
model is {dead}.

In the interpretation of this program as an auto-epistemic theory, $P_0$
corresponds to the auto-epistemic theory (AEL):

$$AEL(P_0) = \{dead \leftarrow \neg K\, alive\}$$

which reads as: one is dead if it is not believed that one is alive. On the other 
hand, under completion semantics the meaning of this program is given by the 
FOL theory: 

$$comp(P_0) = \{\neg alive,\ dead \leftrightarrow \neg alive\}$$

These readings show important differences. The completion reading of $P_0$ states
that alive is false while the auto-epistemic reading of $P_0$ gives no information
about alive; hence alive is not known. The completion reading maps implication
to equivalence and negation to classical objective negation, while the
auto-epistemic reading maps negation to a modal operator ($\neg K$) and preserves the
implication.

How to explain that, despite this intuitive difference, the stable model (which
formalises the default/auto-epistemic reading) corresponds to the model of the
completion? The reason is that models in stable semantics and in classical logic
play a different role. A stable model is a belief set, the set of atoms which
are believed, while the model of the completion, as a model of a FOL theory,
represents a possible state of the world. Because models in both semantics play
a different role, a simple comparison between them does not reveal the different
meanings of both semantics.

Actually, a clear and correct model theoretic comparison of the meaning 
of the auto-epistemic reading and of the completion is possible if done on the 
basis of the possible world model of the auto-epistemic theory and of the set 
of models of the completion. Both are sets of models; in both sets the role of 
models is identical: they represent possible states of the world. Such a comparison 
confirms the intuitive differences between the two readings. The possible world 
model of the AEL theory $\{dead \leftarrow \neg K\, alive\}$ is {{dead}, {alive, dead}}. This
set of models reflects indeed the intuitive meaning of $AEL(P_0)$: alive can be
true or false, hence nothing is known about alive; (therefore) dead is always true.
Note that the belief set, i.e. the stable model, is the intersection of these possible 
states. In contrast, the set of models of the completion is the singleton {{dead}}. 
Interpreted as a possible world model, it represents that dead is known to be 
true, alive known to be false. 
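The comparison above can be replayed mechanically for this two-atom program. The sketch below is illustrative only: it enumerates the classical models of the completion and, assuming the usual fixpoint reading of a possible world model (a set W of worlds that equals the set of worlds satisfying the theory when K is evaluated over W), the possible world models of the auto-epistemic theory. The function names are hypothetical and not taken from the paper.

```python
from itertools import chain, combinations

ATOMS = ("alive", "dead")

def powerset(items):
    """All subsets of `items`, each returned as a frozenset."""
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

WORLDS = powerset(ATOMS)  # the four 2-valued interpretations of {alive, dead}

def completion_models():
    # comp(P0) = { not alive, dead <-> not alive }
    return [w for w in WORLDS
            if "alive" not in w and (("dead" in w) == ("alive" not in w))]

def ael_holds(world, belief_worlds):
    # AEL(P0) = { dead <- not K alive }; K alive holds iff alive is true
    # in every world of the candidate set of possible worlds.
    k_alive = all("alive" in v for v in belief_worlds)
    return k_alive or "dead" in world

def ael_possible_world_models():
    # Assumed fixpoint condition: W is exactly the set of worlds that
    # satisfy the theory when K is interpreted over W itself.
    return [ws for ws in powerset(WORLDS)
            if ws == frozenset(w for w in WORLDS if ael_holds(w, ws))]

ael_models = ael_possible_world_models()
belief_set = frozenset.intersection(*ael_models[0])  # atoms true in every world
```

Under these assumptions the intersection of the possible worlds gives the belief set, which is how the stable model and the model of the completion can coincide although the two sets of possible worlds differ.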

This observation motivates a closer investigation of the relation between logic 
programming and inductive definitions. An inductive definition is a form of
constructive knowledge. Constructive information defines a relation (or a collection
of relations) through a constructive process of iterating a recursive recipe. This
recipe defines new instances of the relation in terms of the presence (and
sometimes the absence) of other tuples of the relation. A broad class of human
knowledge in many areas of human expertise, ranging from common sense knowledge
situations to mathematics, is of a constructive nature. One example is Reiter’s
formalisation of situation calculus [23]; in this approach, a situation calculus 
can be understood as an inductive definition on the well-founded poset of sit- 
uations. Another example is in [26], where we argue that causality information 
in the context of the ramification problem is a form of constructive information. 
Causes, effects and forces propagate in a dynamic system through a constructive
process; consequently, the semantics of causality rules is defined by an inductive 
definition which, by its constructive nature, mirrors the physical process of the 
effect propagation. 

In the context of mathematics, constructive information appears par excellence
in inductive definitions. For example, as suggested by the name, the transitive
closure of a binary relation is naturally perceived as the relation obtained
through the construction process of closing the relation under the transitivity
rule. Not coincidentally, inductive definitions have been studied in constructive
mathematics and intuitionistic logic, in particular in the sub-areas of Inductive
Definition logics, Iterated Inductive Definition logics and Fixpoint logics.
The main goal of this paper is to review some of this work and to show
how inductive definitions are formalised in these areas; this immediately reveals 
strong relationships with least model and perfect model semantics of logic pro- 
gramming (section 2). I point to fundamental knowledge theoretic problems in 
these formalisms (section 3) and argue that the logic program formalism under 
well-founded semantics provides a superior formalisation (section 4). Section 5 
considers some implications. 



2 Inductive Definitions in mathematics 

One can distinguish between positive inductive definitions and definitions by 
induction on a well-founded set. A prototypical example of a definition by (posi- 
tive) induction is the one of the transitive closure Tr of a graph R. Tr is defined 
inductively as follows. Tr contains an arc from x to y if 

— R contains an arc from x to y; 

— R contains an arc from x to z, and Tr contains an arc from z to y. 

It could be formally represented by the rules: 

$$\mathcal{D}_{trans} = \left\{ \begin{array}{l} tr(X,Y) \leftarrow graph(X,Y) \\ tr(X,Y) \leftarrow graph(X,Z),\ tr(Z,Y) \end{array} \right\}$$

The intended interpretation of this definition is that the transitive closure is the 
least graph satisfying the implications rather than any graph satisfying the above 
implications. Alternatively, the transitive closure can be obtained in a construc- 
tive way by applying these implications in a bottom up way until saturation. It is 
commonly known that inductive definitions such as the one of transitive closure 
cannot be expressed in FOL, and a fortiori, not in the completion semantics^1.
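The two readings of the transitive closure rules can be contrasted on small finite graphs. The following is a minimal sketch, not the paper's formalism: `transitive_closure` applies the two rules bottom-up until saturation, and `satisfies_completion` checks the Clark completion of the same rules for a candidate relation. Both function names are illustrative.

```python
def transitive_closure(graph):
    """Apply the two rules bottom-up until no new arc is derived."""
    tr = set()
    while True:
        derived = set(graph)                          # tr(X,Y) <- graph(X,Y)
        derived |= {(x, y) for (x, z) in graph
                    for (z2, y) in tr if z == z2}     # tr(X,Y) <- graph(X,Z), tr(Z,Y)
        if derived <= tr:
            return tr
        tr |= derived

def satisfies_completion(domain, graph, tr):
    """Check tr(X,Y) <-> graph(X,Y) or exists Z. graph(X,Z) and tr(Z,Y)."""
    return all(
        ((x, y) in tr) == ((x, y) in graph or
                           any((x, z) in graph and (z, y) in tr for z in domain))
        for x in domain for y in domain)

# Domain {a, b} with a single loop on a: the least relation is {(a, a)},
# but the larger relation {(a, a), (a, b)} also satisfies the completion.
domain = {"a", "b"}
graph = {("a", "a")}
intended = transitive_closure(graph)
unintended = {("a", "a"), ("a", "b")}
```

Here both `intended` and `unintended` pass `satisfies_completion`, which illustrates why the completion alone does not single out the least relation.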

Typical for the above sort of inductive definition is that the induction is 
positive: i.e. the defined concept depends positively on itself, and hence a unique 

^1 A simple counterexample: verify that the unintended interpretation with domain
{a, b} and I(graph) = {(a, a)} and I(tr) = {(a, a), (a, b)} satisfies the completion of
the implications.
least relation exists. In definitions by (possibly transfinite) induction on a
well-founded poset, this is not necessarily the case. In definitions of this kind, a
concept is defined for a domain element in terms of strictly smaller elements.
An example is the definition of the ordinal powers of a monotonic operator. A
simple first order example of such a definition is the definition of even numbers
in the well-founded poset $(\mathbb{N}, <)$. One defines that a natural number n is even by
induction on <:

— n = 0 is even;

— if n is not even then n + 1 is even; otherwise n + 1 is not even.

A formal representation of the definition in the form of implications is: 

$$\mathcal{D}_{even} = \left\{ \begin{array}{l} even(0) \leftarrow \\ even(s(X)) \leftarrow \neg even(X) \end{array} \right\}$$

Now the defined predicate even occurs negatively in the body of the rule. Verify 
that in the natural numbers, this theory has infinitely many minimal models^2.

Its semantics can be described by a constructive process and is also expressed 
well by the Clark completed definition of the above implications: 

$$\forall X.\ even(X) \leftrightarrow X = 0 \lor \exists Y.\ X = s(Y) \land \neg even(Y)$$
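On a finite prefix of the natural numbers the difference between the implications and the completed definition can be checked directly. The sketch below is illustrative: the bound `N` and the function names are assumptions, and restricting to 0..N is only a finite approximation of the definition over all natural numbers.

```python
N = 10  # finite prefix 0..N of the natural numbers (illustrative bound)

def construct_even(limit):
    """Build even by induction on <: decide n+1 from the already fixed value of n."""
    even = {0}
    for n in range(limit):
        if n not in even:
            even.add(n + 1)
    return even

def satisfies_rules(s, limit):
    # even(0) and, for each n < limit, even(n+1) <- not even(n)
    return 0 in s and all(n in s or n + 1 in s for n in range(limit))

def satisfies_completion(s, limit):
    # even(X) <-> X = 0 or exists Y. X = s(Y) and not even(Y)
    return all((x in s) == (x == 0 or (x - 1) not in s)
               for x in range(limit + 1))

intended = construct_even(N)                             # 0, 2, 4, ...
unintended = {0} | {n for n in range(N + 1) if n % 2}    # 0, 1, 3, 5, ...
```

Both sets satisfy the implications, but only `intended` satisfies the completed definition, which matches the claim that the completion captures the constructive reading here.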

A more complex example showing the elements of transfinite induction in 
a richer context is the concept of depth of an element in a well-founded poset 
$(P, <)$. Define the depth of an element x of P by transfinite induction as the least
ordinal which is a strict upper-bound of the depths of elements $y \in P$ such that
$y < x$.

Formally, let $F[X, D]$ mean that D is a larger ordinal than the depths of all
elements Y < X:

$$F[X, D] \equiv \forall Y, D_Y.\ (Y < X \land depth(Y, D_Y) \rightarrow D_Y < D)$$

Then, depth is represented by the singleton definition $\mathcal{D}_{depth}$:

$$depth(X, D_X) \leftarrow F[X, D_X] \land [\forall D.\ F[X, D] \rightarrow D_X \leq D]$$

Construction or Clark completion gives the semantics of this definition. The 
defined predicate depth occurs negatively in the body of the rule, and as a 
consequence, multiple unintended minimal models may exist^3.
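For a finite well-founded poset all depths are natural numbers, and the inductive definition can be executed directly. This is a minimal sketch under that finiteness assumption; `depths` and the example order (proper divisibility) are illustrative choices, not part of the paper.

```python
def depths(elements, less_than):
    """depth(x) = least natural number strictly above depth(y) for all y < x.

    `less_than(y, x)` is assumed to be a well-founded strict order on a finite
    set, so every depth is a natural number and the recursion terminates.
    """
    memo = {}

    def depth(x):
        if x not in memo:
            below = [depth(y) for y in elements if less_than(y, x)]
            memo[x] = max(below) + 1 if below else 0
        return memo[x]

    return {x: depth(x) for x in elements}

# Example order: y < x iff y is a proper divisor of x.
def proper_divisor(y, x):
    return y != x and x % y == 0

divisor_depths = depths([1, 2, 3, 4, 6, 12], proper_divisor)
```

No stratification is supplied by the caller: the recursion simply follows the order, which is the point made later about definitions over well-founded posets.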

One application of this definition is the definition of depth of a tree. Here 
the well-founded poset is the set of trees (with values from a given domain D) 
without infinite branches in a domain; the partial order is the subtree relation. 
For finitely branching trees, the depth is always a natural number; for infinitely 

^2 E.g. {even(0), even(2), ..} but also {even(0), even(1), even(3), even(5), ..}.

^3 E.g. in the context of $(\mathbb{N}, <)$, an unintended minimal model is
{depth(0, 0), depth(0, 1), depth(1, 2), .., depth(n, n + 1), ..}.
branching trees, the depth may be an infinite ordinal. E.g. the tree with branches
(0, 1, 2), (0, 2, 3, 4), (0, 3, 4, 5, 6), .., (0, n, .., 2n), .. is a tree with depth ω.

The above two types of inductive definitions require a different sort of seman- 
tics. This raises the question whether a uniform principle of inductive definition 
can be proposed which is correct for all inductive definitions and hence gener- 
alises and integrates completion and minimisation. The first attempt to formalise 
such a principle was in the context of Iterated Inductive Definitions. 

The study of inductive definitions in mathematics has started with Post [19], 
Spector [24] and Kreisel [15]. Important work in this area includes [9,16,18,2,4]. 
An offspring of this research is fixpoint logic, currently used in databases [1]. 
Below is an overview of ideas proposed in the area of Inductive, Iterated Inductive 
Definitions (IID) and fixpoint logics. The overview is an attempt to give a faithful 
and comprehensive presentation of the essential ideas in these areas, while I 
have taken the freedom to reformulate syntax or semantics in order to increase 
uniformity and comprehensibility. 



2.1 Positive Inductive Definitions 

Positive Inductive Definitions have been formalised in various ways. In the style 
of [9], an inductive definition on a given interpretation M is represented as a 
formula: 

$$p(X) \leftarrow F[X, p]$$

where F[X,p] is a First Order Logic (FOL) formula with only positive occur- 
rences of the defined symbol p but arbitrary occurrences of symbols interpreted 
in M. In fixpoint logic, the relation p would be denoted $\mu_\Psi F[X, \Psi]$ (here p is
replaced by a predicate variable Ψ).

[2] studies inductive definitions in an abstract representation with an obvious
correspondence with definite logic programs. A definition on a domain D of
propositional symbols is represented as a possibly infinite set $\mathcal{D}$ of rules $p \leftarrow B$
with $p \in D$, $B \subseteq D$^4.

[2] gives an overview of three equivalent mathematical principles for describ- 
ing the semantics of a (Positive) inductive definition. They are equivalent with 
the way the least model semantics of definite logic programs can be defined [27] . 

— The model can be defined as the least model of the implications. E.g., in [9], 
this minimal model semantics is expressed through a circumscription-like 
axiom (expressing that p must be the least predicate rather than a minimal 
one). 

— The model can be expressed constructively as the least fixpoint of a Tp-like 
operator associated with the definition. In the presentation of [2], inductive 

^4 Definitions represented in the other style can be represented in this abstract way.
Given the mathematical structure M and formula $F[X, p]$, define the domain D as
the set of atoms p(x) with $x \in M^n$. Define $\mathcal{D}$ as the set of rules $p(x) \leftarrow B$ for each
x and each set B of p-atoms such that $M \models F[x, B]$; meaning that F is true for x
in M when p is interpreted as the set B.
definitions are dually defined as monotonic $T_P$-like operators. This is the
common way in fixpoint logic (hence the name).

— The model can be expressed also as the interpretation in which each atom 
has a proof tree. Also this formalisation has been used in LP in [7] . Because it 
is less commonly used in LP, I present it here for a slightly extended version 
of the formalism of [2] . 

Let be given a symbol domain D, including a subset $D_o \subseteq D$ which includes
the truth values t, f, and an interpretation M interpreting the symbols of $D_o$
such that M(t) = t, M(f) = f. The symbols of $D_o$ are called the open or
interpreted symbols. Also given is a definition $\mathcal{D}$ which is a set of rules $p \leftarrow B$
with head $p \in D \setminus D_o$ and body B consisting of atoms of $D \setminus D_o$ and positive
or negative literals of $D_o$^5. The set $Defined(\mathcal{D}) = D \setminus D_o$ is called the set of
defined symbols, the set of open symbols $D_o$ is often denoted $Open(\mathcal{D})$. We
assume that each symbol $p \in Defined(\mathcal{D})$ has at least one rule $p \leftarrow B \in \mathcal{D}$
(it may be the rule $p \leftarrow \{f\}$). Also the body B of a rule is never empty (B
may be the singleton {t}).

A $\mathcal{D}$-proof-tree T of $p \in D$ is a tree of literals of D with p as root such that:

• all leaves of T are positive or negative open literals; all non-leaves contain
defined atoms;

• for each non-leaf node p with set of immediate descendants B: $p \leftarrow B \in \mathcal{D}$;

• T is loop-free; i.e. contains no infinite branches.

The model $\overline{M}^{\mathcal{D}}$ of $\mathcal{D}$ given M can be characterised as the set of atoms $p \in D$
which occur in the root of a proof-tree T such that all leaves are true literals
in M. Note that interpreted literals have proof-trees consisting of one node;
as a consequence, $\overline{M}^{\mathcal{D}}$ extends M.
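For a finite propositional definition the proof-tree characterisation can be tried out with a small search. The sketch below is illustrative and assumes a particular encoding that is not in the paper: rules are `(head, body)` pairs, a negative open literal is written `("not", atom)`, and `M` maps open symbols to truth values. Refusing to repeat a defined atom on a branch keeps trees loop-free; in the finite case this loses no provable atoms, since a repeated atom can always be replaced by its lower subtree.

```python
def proof_tree(lit, rules, M, path=frozenset()):
    """Return a proof tree (literal, subtrees) whose leaves are true in M, or None."""
    if isinstance(lit, tuple):            # negative open literal ("not", a)
        return (lit, []) if not M[lit[1]] else None
    if lit in M:                          # positive open literal
        return (lit, []) if M[lit] else None
    if lit in path:                       # would repeat a defined atom on this branch
        return None
    for head, body in rules:
        if head != lit:
            continue
        subtrees = [proof_tree(b, rules, M, path | {lit}) for b in body]
        if all(t is not None for t in subtrees):
            return (lit, subtrees)
    return None

# Hypothetical example: open symbols t, f, e; defined symbols p, q, r, s.
M = {"t": True, "f": False, "e": True}
rules = [
    ("p", ["q", "e"]),
    ("q", ["t"]),
    ("r", ["r"]),              # only a looping derivation, hence no proof tree
    ("s", [("not", "e")]),     # leaf false in M
]
defined = {"p", "q", "r", "s"}
model = {a for a in defined if proof_tree(a, rules, M) is not None}
```

The search is exponential in the worst case; it is meant to mirror the definition, not to be an efficient evaluation method.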



2.2 Iterated Inductive Definitions 

The logics of Iterated Inductive Definitions are or can be seen as attempts to
formalise the mathematical principle of definition by (transfinite) induction on
a well-founded order. Iterated Inductive definitions were first introduced in [15]
and later studied in [9] and [16]. [2] formulates the intuition of Iterated Inductive
Definitions in the following way. Given a mathematical structure M fixing the
interpretation of the interpreted predicates and function symbols, a positive
inductive definition $\mathcal{D}$ prescribes the interpretation of the defined predicate(s).
Once the interpretation of the defined symbols p is fixed, M can be extended
with these interpretations, yielding a new interpretation $\overline{M}^{\mathcal{D}}$. On top of this
structure, again new predicates may be defined in a similar way as before. The
definition of these new predicates may depend negatively on the defined predicates

^5 Allowing positive or negative open literals is an extension to the formalism of [2].
It does not introduce any complexity because the interpretation of these literals is 
given. This extension will facilitate the leap to inductive definitions with recursion 
over negation. 
p as these are interpreted in $\overline{M}^{\mathcal{D}}$. This principle can be iterated in an arbitrary,
even transfinite sequence of positive inductive definitions.

In [2], the abstract definition logic defined there is not explicitly extended
with this idea, but given the above intuition, the extension with negation is
straightforward. Given a domain D and mathematical structure M, an Iterated
Inductive Definition (IID) would be a possibly transfinite sequence
$\mathcal{D} = (\mathcal{D}_\alpha)_{\alpha < \alpha_{\mathcal{D}}}$ of positive inductive definitions such that:

— each defined symbol p is defined in a unique $\mathcal{D}_{\alpha_p}$; we call $\alpha_p$ the stratum of p;

— for each rule $p \leftarrow B \in \mathcal{D}$: for each defined atom $q \in B$, $\alpha_q \leq \alpha_p$; for each
defined atom q such that $\neg q \in B$, $\alpha_q < \alpha_p$.

The model $\overline{M}^{\mathcal{D}}$ of a definition can be obtained by transfinitely iterating the
principle of positive inductive definition over the sequence $(\mathcal{D}_\alpha)_{\alpha < \alpha_{\mathcal{D}}}$.
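For a finite sequence of strata the iteration can be written out in a few lines. The sketch below is an illustration under assumptions of its own: rules are triples `(head, positive_body, negative_body)`, strata are listed in increasing order, and negative body atoms belong to strictly lower strata, so their truth value is already fixed when a stratum is processed. The example is a finite cut (n up to an arbitrary bound `K`) of a definition with `even`, `sw` and `ok`; the transfinite case is not covered.

```python
def least_model(rules, decided_true):
    """Positive induction for one stratum, given the atoms already known true."""
    true = set(decided_true)
    changed = True
    while changed:
        changed = False
        for head, pos, neg in rules:
            if (head not in true
                    and all(q in true for q in pos)
                    and all(q not in true for q in neg)):   # neg atoms: lower strata
                true.add(head)
                changed = True
    return true

def iid_model(strata, open_true=frozenset()):
    """Iterate positive induction over the strata in increasing order."""
    model = set(open_true)
    for rules in strata:
        model = least_model(rules, model)
    return model

K = 4  # illustrative bound replacing the infinite family of strata
strata = []
for n in range(K + 1):
    rules = [(f"even({n})", [f"even({n})"], [])]            # even(n) <- even(n)
    rules.append(("even(0)", ["t"], []) if n == 0
                 else (f"even({n})", [], [f"even({n - 1})"]))
    strata.append(rules)
strata.append([("sw", [f"even({n})", f"even({n + 1})"], []) for n in range(K)])
strata.append([("ok", [], ["sw"])])

model = iid_model(strata, open_true={"t"})
```

Apart from the open symbol `t`, the computed model contains `ok` and `even(n)` for even n only, as expected for the finite cut.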

There is an obvious correspondence between Iterated Inductive Definitions
(IIDs) and stratified logic programs under perfect model semantics [3,20,21].
Already in 1984, [14] defines a semantics for stratified logic programs based on
the Iterated Inductive Definition (IID) logic defined in [16]. To my knowledge,
this was really the first time that the perfect model semantics for stratified logic
programs was defined. Apparently this work stayed largely unnoticed, perhaps
because, like the semantics in [16], it is based on sequent calculus, which to some
extent increases the mathematical complexity and obscures the simple intuitions
underlying this semantics.

Though the intuition of IIDs as formulated in [2] is straightforward, it is not
easy to see how this idea is implemented in IID logics such as those of [9], [4] and
also in [16]. The reason for this seems as follows. The goal of this research was
to investigate theoretical expressivity of transfinite forms of IIDs. As explained
in [4], a definability study makes only sense in a finitely represented logic, while
transfinite IIDs in the abstract setting above are by definition infinite objects.
[9] investigates IIDs encoded in an IID-form, a single FOL formula of the form
F[N, X, P], and expresses its semantics in a circumscription-like second order
formula. The problem is that this encoding is extremely tedious and this blurs
the simple intuitions behind this work and the similarities with the perfect model
semantics.

Nevertheless, it is interesting (if only from a historical perspective) to see how
transfinite definitions can be encoded finitely as an IID-form and how a perfect
model-like semantics can be expressed in such a notation. Consider the following
definition constructed for the sole purpose of illustrating the encoding:



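$$\mathcal{D}_{even} = \left\{ \begin{array}{ll} (0) & even(0) \leftarrow \mathbf{t} \\ (n+1) & even(n+1) \leftarrow \neg even(n) \\ (n) & even(n) \leftarrow even(n) \\ (\omega) & sw \leftarrow even(n),\ even(n+1) \\ (\omega+1) & ok \leftarrow \neg sw \end{array} \right.$$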





The symbol sw (which abbreviates something-wrong) represents that two
subsequent numbers are even, and ok is its negation. This definition can be stratified
(the strata of the defined predicates are given). The model obtained after ω + 2
iterations is $\{ok, even(2n) \mid n \in \mathbb{N}\}$.

To encode such an abstract IID, a binary meta-predicate h (of holds) is used:
$h(\alpha, p)$ means that the stratum of p is α and that p is defined true. The first
step in encoding such an abstract IID $(\mathcal{D}_\alpha)$ yields a possibly infinite disjunction
$F[N, X, P]$. For any rule $p \leftarrow \{.., q, r, \neg s, ..\}$ with $\alpha_q = \alpha_p$, $\alpha_r < \alpha_p$, $\alpha_s < \alpha_p$,
add one disjunct:

$$N = \alpha_p \land X = p \land .. \land P(q) \land h(\alpha_r, r) \land \neg h(\alpha_s, s) \land ..$$

This disjunct is obtained as a conjunction of $N = \alpha_p \land X = p$, corresponding
to the head p, a conjunct P(q) for any atom q of the same stratum as the head,
and a literal $h(\alpha_r, r)$ and $\neg h(\alpha_s, s)$ for the other literals $r, \neg s \in B$ defined in
lower strata^6.

The result is an infinitary formula $F[N, X, P]$. Here, N ranges over the
ordinals $\alpha < \alpha_{\mathcal{D}}$, X over atoms and P over sets of atoms. The formula corresponding
to $\mathcal{D}_{even}$ is the following infinitary disjunction with disjuncts for each $0 \leq n$:

$$\left\{ \begin{array}{l} N = 0 \land X = even(0)\ \lor \\ N = n+1 \land X = even(n+1) \land \neg h(n, even(n))\ \lor\ ... \\ N = n \land X = even(n) \land P(even(n))\ \lor\ ... \\ N = \omega \land X = sw \land h(n, even(n)) \land h(n+1, even(n+1))\ \lor\ ... \\ N = \omega+1 \land X = ok \land \neg h(\omega, sw) \end{array} \right.$$

There is only one step more to go to reduce this formula to an equivalent
finite IID-form. But first, we show how to express the semantics of the IID. Two
axioms express essentially that at each stratum α, the set $h(\alpha, .) = \{p \mid h(\alpha, p)$ is
true$\}$ satisfies the definition $\mathcal{D}_\alpha$. These axioms express the principle of positive
inductive definition: that this set must satisfy the implications of $\mathcal{D}_\alpha$ and that it
must be contained in each set satisfying the implications. Below, $F[P(\tau)/h(N, \tau)]$
denotes the formula obtained by replacing each expression $P(\tau)$ for arbitrary
term τ by $h(N, \tau)$.

The first axiom expresses that for each ordinal α and given h for lower strata,
$h(\alpha, .)$ satisfies the implications in $\mathcal{D}_\alpha$:

$$\forall N, X.\ (h(N, X) \leftarrow F[P(\tau)/h(N, \tau)])$$

^6 In [9], literals $h(\alpha_q, q)$ are replaced by open formulas $h(\alpha_q, q) \land \alpha_q < N$. This open
formula represents the restriction of h to strata < N (atoms q at higher strata are
false in $h(\alpha_q, q) \land \alpha_q < N$). The resulting, more complex axioms can be seen to be
equivalent with our axioms for IID-forms obtained from a stratified abstract IID $\mathcal{D}$.
The reason for this choice seems to be that the stratification condition, which can
be defined nicely for abstract IIDs, cannot easily be formulated directly for
IID-forms. The more complex axioms determine a unique h predicate even if F[N, X, P]
encodes a nonstratifiable or incorrectly stratified definition (but the semantics may
be unnatural then); in that case, our simpler axioms do not determine a unique
h-predicate due to mutual dependencies between predicates defined at lower and at
higher level.
One can verify that if one assigns the values $\alpha_p$ to N and p to X and
eliminates false disjuncts, then this complex formula reduces to:

$$h(\alpha_p, p) \leftarrow .. \land h(\alpha_q, q) \land \neg h(\alpha_r, r) \land ..\ \lor\ ..$$

with a disjunct for each $p \leftarrow \{.., q, \neg r, ..\} \in \mathcal{D}$.

The second axiom expresses that for each ordinal α, $h(\alpha, .)$ is contained in
each set Ψ which satisfies the implications of $\mathcal{D}_\alpha$. It is a second order axiom,
using a set variable Ψ which ranges over sets of atoms and it is a variant of a
circumscription axiom:

$$\forall N.\forall \Psi.\ [\forall X.\ (\Psi(X) \leftarrow F[P(\tau)/\Psi(\tau)])] \rightarrow [\forall X.\ (h(N, X) \rightarrow \Psi(X))]$$

Finally, the infinitary IID-form F should be further encoded by a finite for- 
mula. This involves: 

— encoding ordinals by a (primitive recursive) well-ordering on natural numbers.
E.g. the total order $2 \prec 3 \prec .. \prec 0 \prec 1$ is a well-ordering encoding the
ordinals 0, 1, .., ω, ω + 1.

— encoding atoms by natural numbers: an obvious proposal here is to encode
each atom by the natural number encoding the stratum of the atom; i.e.
even(n) by n + 2, sw by 0 and ok by 1.

— encoding tuples of natural numbers by natural numbers. Details of this are 
tedious and irrelevant for this paper; we omit them. 

In this encoding, an infinite number of disjuncts can be represented in a 
finite formula using quantification in the natural numbers. The different sets of 
disjuncts are encoded as follows: 

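$$\begin{array}{l}
\{N = 0 \land X = even(0)\} \\
\quad \longrightarrow\ N = 2 \land X = 2 \\
\{N = n+1 \land X = even(n+1) \land \neg h(n, even(n)) \mid n \in \mathbb{N}\} \\
\quad \longrightarrow\ \exists M.\ N = M+1 \land 2 \leq M \land X = N \land \neg h(M, M) \\
\{N = n \land X = even(n) \land P(even(n)) \mid n \in \mathbb{N}\} \\
\quad \longrightarrow\ 2 \leq N \land X = N \land P(N) \\
\{N = \omega \land X = sw \land h(n, even(n)) \land h(n+1, even(n+1)) \mid n \in \mathbb{N}\} \\
\quad \longrightarrow\ N = 0 \land X = 0 \land \exists M.[2 \leq M \land h(M, M) \land h(M+1, M+1)] \\
\{N = \omega+1 \land X = ok \land \neg h(\omega, sw)\} \\
\quad \longrightarrow\ N = 1 \land X = 1 \land \neg h(0, 0)
\end{array}$$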

The resulting finite IID-form is: 

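$$\left\{ \begin{array}{l} N = 2 \land X = 2\ \lor \\ \exists M.[N = M+1 \land 2 \leq M \land X = N \land \neg h(M, M)]\ \lor \\ 2 \leq N \land X = N \land P(N)\ \lor \\ N = 0 \land X = 0 \land \exists M.[2 \leq M \land h(M, M) \land h(M+1, M+1)]\ \lor \\ N = 1 \land X = 1 \land \neg h(0, 0) \end{array} \right.$$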
2.3 Inflationary Fixed-point Logic

[2] proposes another extension of positive inductive definitions with negation.
With an arbitrary formula $F[X, P]$ with negative occurrences of P allowed, the
resulting $T_P$-like operator $T_{F[X,P]}$ is not monotonic and may not have a least
fixpoint. However, the operator mapping $I$ to $I \cup T_{F[X,P]}(I)$ is increasing (though
not monotonic) and therefore a fixpoint can be constructed. This idea has been
used in fixpoint logic with inflationary semantics [1].

Inflationary fixpoint logic is known to be expressive; however, it is not a
natural formalisation of inductive definitions over a well-founded set, and therefore,
this extension is not relevant in the context of this paper. For example, if we
construct a formula $F_{even}[X, even]$ for $\mathcal{D}_{even}$ in the same way as for positive
inductive definitions, we obtain: $X = 0 \lor \exists Y.\ X = s(Y) \land \neg even(Y)$.

After one application of the inflationary fixpoint operator, the unintended
fixpoint $\{even(n) \mid n \in \mathbb{N}\}$ is obtained.
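The collapse described above is easy to reproduce on a finite prefix of the natural numbers. This is a small sketch, with the bound `limit` and both function names introduced only for illustration: `t_even` applies the formula for even once, and `inflationary_fixpoint` iterates the increasing operator that adds the result to the current set.

```python
def t_even(i, limit):
    """One application of X = 0 or exists Y. X = s(Y) and not even(Y), on 0..limit."""
    return {0} | {y + 1 for y in range(limit) if y not in i}

def inflationary_fixpoint(limit):
    """Iterate I -> I union T(I), starting from the empty set."""
    i = set()
    while True:
        nxt = i | t_even(i, limit)
        if nxt == i:
            return i
        i = nxt

first_step = set() | t_even(set(), 10)
fixpoint = inflationary_fixpoint(10)
```

Starting from the empty set, no number is even yet, so the negative condition holds everywhere and the first step already derives every number; later steps can only add atoms and never retract them.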

3 A critique on Iterated Inductive Definitions 

The stratified IID formalisms provide a correct treatment of inductive definitions 
with negation. The IID-forms as defined in e.g. [9] were not intended for use in
Knowledge Representation and are absolutely unsuitable for that purpose. But
any stratified formalism for inductive definitions with negation will pose certain 
fundamental problems. 

(I) A stratification of a definition does not provide any information about the
defined relations. This can be seen from the fact that choosing another
stratification for a definition has no impact on its semantics; moreover, there exist ways
to construct the semantics of an IID without recourse to a predefined syntactical
stratification. It is undesirable that in IIDs, a stratification must be chosen
and this choice is explicitly reflected in the representation of the definition.

(II) The stratification of an Iterated Inductive Definition is based on a syntactical
criterion. As a consequence, a rule set formulated for one alphabet may be
stratifiable whereas the corresponding rule set in a linguistic variant of the alphabet
may be non-stratifiable. The following variant of the definition $\mathcal{D}_{even}$ illustrates
this. Assume that we use the alphabet $\{even(n), successor(n, m) \mid n, m \in \mathbb{N}\}$
with a predicate representation of the concept of successor. In this alphabet, the
natural representation of the inductive definition of even is the set with for each
$n, m \in \mathbb{N}$ the following rules:

$$\left\{ \begin{array}{l} successor(n+1, n) \\ even(0) \\ even(n) \leftarrow successor(n, m),\ \neg even(m) \end{array} \right\}$$

This variant definition cannot be stratified due to the presence of rules
$even(m) \leftarrow successor(m, m), \neg even(m)$. A good formalisation should not be so
dependent on intuitively innocent linguistic variation.
(III) As a formalisation of inductive definitions on well-founded posets, the
requirement in stratified IIDs of an explicit stratification is problematic in general.
A definition of a concept (like evenness or depth) for x in terms of all y < x is
mathematically well-constructed; yet a stratification for such a definition may be
in general unknown. As an example, consider the inductive definition of depth
of an element in a well-founded order or the depth of a tree. The need for an
explicit stratification is unnecessary and unnatural.



3.1 WFS: An improved Principle of Inductive Definition 

In this section, I argue that the mathematics of (a variant of) the well-founded 
semantics of logic programming [28] provides an improved formalisation of the 
principle of inductive definition. 

Just like the perfect model, the model $\overline{M}^{\mathcal{D}}$ of a stratified Iterated Inductive
Definition $\mathcal{D}$ is obtained by iterating the positive induction principle and
constructing a sequence $(M_\alpha)_{\alpha < \alpha_{\mathcal{D}}}$ of interpretations of increasing sub-domains
which starts with M and gives gradually better approximations of the model
$\overline{M}^{\mathcal{D}}$. Each $M_\alpha$ defines the truth value of all symbols of its sub-alphabet
and leaves atoms defined at later levels undefined. The role of the stratification
in this process is to delay the use of some part of the definition until enough
information is available to safely apply the positive induction principle on that
part of the definition.

The same ideas can be implemented in a different way, without relying on an
explicit syntactical partitioning of the definition. Instead of using 2-valued
interpretations of sub-alphabets, partial interpretations can be used. Here, a partial
interpretation is a partial function from the set of atoms D to {t, f}.
Equivalently, we use the classical formalisation as a total function from the set of atoms
D to {t, u, f}^7. The positive induction principle can be conservatively extended
for definitions with negation. For a definition $\mathcal{D}$, we define the Positive
Induction Operator $\mathcal{PI}_{\mathcal{D}}$ which takes as input a partial interpretation I representing
well-defined truth values for a subset of atoms, and derives an extended partial
interpretation defining the truth values of other atoms that can be derived by
positive induction. Definition of truth values of atoms for which not enough
information is available is delayed. The model of a definition is obtained then by
a fixpoint construction.

From a knowledge theoretic point of view, the key problem in the above 
enterprise is the definition of the principle of positive induction in the context 
of definitions with negation. A formalisation based on proof-trees shows most 
clearly the structural similarities between positive induction for PID’s and for 
inductive definitions with negation. 

Footnote: This formalisation is mathematically equivalent to the previous one,
is more common and leads to more elegant mathematics. Note that in this view,
$u$ plays a role similar to that of null values in databases: just like a null
value, $u$ is not a real truth value; it is a placeholder for an (as yet)
undefined truth value. Below, I return to the issue of the interpretation of
$u$.

We formalise the above ideas for a formalism which is the natural extension of
the abstract definitions of [2] with negation; at the same time, it is an
infinitary version of the propositional LP-formalism. Given is a domain $D$ of
propositional symbols. In the new, more general setting, a definition
$\mathcal{D}$ consists of rules in which positive and negative open or defined
literals may appear in the (nonempty) body. As before, the set of defined
symbols that appear in the head of a rule is denoted $Defined(\mathcal{D})$;
the set of open or interpreted symbols is denoted $Open(\mathcal{D})$. Also
given is an interpretation $M$ of the open symbols $Open(\mathcal{D})$.

The definition of a $\mathcal{D}$-proof-tree $T$ as defined in section 2.1
hardly needs to be altered: it is a tree of literals of $D$ such that:

• leaves contain open literals or negative defined literals; non-leaves contain
defined atoms $p \in Defined(\mathcal{D})$;

• each non-leaf $p$ has a set of direct descendants $B$ such that
$p \leftarrow B \in \mathcal{D}$;

• no infinite branches.

Hence, leaves contain interpreted literals and negations of defined atoms. Note
that interpreted atoms have proof-trees consisting of one root node.

Definition 1. The Positive Induction Operator $\mathcal{PI}_{\mathcal{D}}$ maps
partial interpretations $I$ to $I'$ such that $\forall p \in D$:

– $I'(p) = t$ if $p$ has a proof-tree with all leaves true in $I$;

– $I'(p) = f$ if each proof-tree of $p$ has a false leaf in $I$;

– $I'(p) = u$ otherwise, i.e. no proof-tree of $p$ has only true leaves but
there exists at least one without false leaves.

The Positive Induction Operator is a monotonic operator w.r.t. the precision
order $\leq_p$, the point-wise extension of $u \leq_p f$, $u \leq_p t$.
Monotonic operators w.r.t. $\leq_p$ have a least fixpoint [10]. Hence, each
interpretation $M$ of the non-defined symbols can be extended to a unique least
fixpoint $\mathcal{PI}_{\mathcal{D}}\uparrow(M)$.

Definition 2. $\mathcal{PI}_{\mathcal{D}}\uparrow(M)$ is the model of
$\mathcal{D}$.
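For a finite propositional definition, Definitions 1 and 2 can be sketched in
Python as follows. This is only an illustration under stated assumptions, not
the construction used in the paper: proof-trees are not built explicitly but are
replaced by a least-fixpoint derivation (which, for finitely many rules, finds
exactly the atoms that have a finite proof-tree), and all names (`derivable`,
`positive_induction`, `model`) are hypothetical.

```python
from typing import Dict, List, Set, Tuple

T, F, U = "t", "f", "u"
Literal = Tuple[str, bool]        # (atom, True) for p, (atom, False) for "not p"
Rule = Tuple[str, List[Literal]]  # head <- body


def leaf_value(interp: Dict[str, str], atom: str, positive: bool) -> str:
    """Three-valued truth value of a leaf literal in a partial interpretation."""
    value = interp[atom]
    if value == U or positive:
        return value
    return F if value == T else T


def derivable(rules: List[Rule], interp: Dict[str, str], allowed: Set[str]) -> Set[str]:
    """Defined atoms having a finite proof-tree whose leaves all evaluate to a value in `allowed`."""
    defined = {head for head, _ in rules}
    derived: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head in derived:
                continue
            # Positive defined atoms are inner nodes: they must be derived themselves.
            # Open literals and negative defined literals are leaves: look them up in interp.
            ok = all(
                atom in derived if (positive and atom in defined)
                else leaf_value(interp, atom, positive) in allowed
                for atom, positive in body
            )
            if ok:
                derived.add(head)
                changed = True
    return derived


def positive_induction(rules: List[Rule], interp: Dict[str, str]) -> Dict[str, str]:
    """One application of the operator of Definition 1 (finite propositional case)."""
    defined = {head for head, _ in rules}
    certain = derivable(rules, interp, {T})      # some proof-tree has only true leaves
    possible = derivable(rules, interp, {T, U})  # some proof-tree has no false leaf
    result = dict(interp)
    for p in defined:
        result[p] = T if p in certain else (U if p in possible else F)
    return result


def model(rules: List[Rule], open_interp: Dict[str, str]) -> Dict[str, str]:
    """Least fixpoint of Definition 2; open_interp must give a value to every open atom."""
    defined = {head for head, _ in rules}
    interp = {**open_interp, **{p: U for p in defined}}
    while True:
        nxt = positive_induction(rules, interp)
        if nxt == interp:
            return interp
        interp = nxt
```

With the single rule p ← ¬p, `model([("p", [("p", False)])], {})` leaves `p`
undefined, while a positive loop such as p ← q, q ← p makes both atoms false.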

The structural resemblance between positive induction in PIDs and in
$\mathcal{PI}_{\mathcal{D}}$ is apparent. There are some important properties.
The first relates this semantics to the WFS semantics of logic programs.

Proposition 3. $\mathcal{PI}_{\mathcal{D}}$ and the 3-valued stable model
operator [22] are identical. The well-founded model of $\mathcal{D}$ is the
model $\overline{M}^{\mathcal{D}}$ of $\mathcal{D}$.

Second, this semantics provides a conservative extension of the IID-style
semantics, as the WFS is known to generalise least model semantics and perfect
model semantics of stratified logic programs.

Third, certain definitions may have partial models (e.g.
$\{p \leftarrow \neg p\}$). Note here the changing role of $u$ during the
fixpoint computation and in the fixpoint. When the truth value of an atom is $u$
at some stage of the fixpoint computation, it means that the truth value of the
atom is not yet determined at this stage. If its truth value is still $u$ in the
fixpoint, it means that $\mathcal{D}$ does not allow the truth value of the atom
to be defined constructively. Hence, undefined atoms in the fixpoint point to
ambiguities in the definition.

There seem to be two sensible treatments of ambiguous definitions. They 
could be considered as inconsistent, in a similar sense as in classical logic: am- 
biguous definitions have no models. In this strict view, Definition 2 is to be 
refined as: 



Definition 4. If $\mathcal{PI}_{\mathcal{D}}\uparrow(M)$ is 2-valued, then it
is the model of $\mathcal{D}$; otherwise, $\mathcal{D}$ has no model.

The result is a 2-valued logic. This is a simple strategy because it avoids potential 
problems with 3-valued models but it has the disadvantage that no sensible 
information can be extracted from an ambiguous definition since such a definition 
entails every formula. This situation is analogous to classical logic. 

The more permissive treatment is to allow definitions with partial models.
The result is a sort of paraconsistent definition logic, i.e. a logic in which
definitions with local inconsistencies or local ambiguities do not entail every
formula.



4 Concluding remarks 

This paper is a study of the concept of (transfinite) inductive definition. The
paper investigates how this concept has been formalised in the past in the ID
and IID areas; drawbacks of these formalisations are pointed out and an improved
formalisation, inspired by logic programming semantics, is proposed. Strong
connections between the formalisations in ID and IID and perfect model
semantics, but also circumscription semantics, have been exposed.

This study is relevant not only as a study of inductive definitions; it also
improves our understanding of the use of LP for knowledge representation and
hence of the role of LP in Artificial Intelligence. The readings of logic
programs as auto-epistemic or default theories on the one hand, and as
definitions on the other hand, give essentially different perspectives on the
meaning of logic programs and on the nature of the negation symbol and the
implication symbol in LP.

In general, a knowledge theoretic study such as the one in this paper is
relevant for developing a knowledge representation methodology. It is (or once
was) a widespread view that the advantage of declarative logic for “encoding”
knowledge lies in its intuitive linguistic reading; in the case of this paper:
the reading of a set of rules as an inductive definition. This reading of the
logic provides the methodological basis for knowledge representation; the tight
connection between formal syntax and semantics and a clear intuitive reading
facilitates making the expert knowledge explicit. Formulas of the theory can be
understood by the experts through the linguistic interpretation, without the
need to construct the formal semantics explicitly. Knowledge theoretic studies
like the one in this paper are important for building natural and systematic
methodologies for knowledge representation. One aim of this study was to clarify
how logic programs can be used for knowledge representation and what sort of
knowledge can be represented in them.

A simple illustration of the impact of the linguistic interpretation on
knowledge methodology is as follows. The definition that dead means not alive is
naturally expressed in LP under the definition reading by the singleton
definition:

$\{dead \leftarrow \neg alive\}$

On the other hand, in Extended Logic Programming [13], which is based on the
default and AEL view, a correct representation would be:

$dead \leftarrow \neg alive$

$\neg dead \leftarrow alive$

A knowledge theoretic study is also relevant for the design or extension of a 
logic. This is also well-illustrated in the case of LP. With respect to knowledge 
representation, a major problem of LP under the default or auto-epistemic view 
is that no definite negative information can be represented. This led Gelfond 
and Lifschitz in [13] to extend the formalism and re-introduce a form of classical 
negation in Extended Logic Programming. 

In the definition view, a logic program entails plenty of definite negative
information. As a matter of fact, the problem with standard LP is the strength
of its closure mechanism: an atom is assumed false unless it can be proven to be
true. As a consequence, representing uncertainty is a serious problem; this
problem has received a lot of attention in recent years. In the definition view
on standard LP, the problem arises because all predicates are defined, i.e. have
a (possibly empty) definition. Hence, the natural idea is to extend the logic
with open predicates which have an arbitrary interpretation. In [6], this idea
was elaborated in an extension of LP, called Open Logic Programming (OLP). I
argued there that OLP provides a knowledge theoretic interpretation of Abductive
Logic Programming as a definition logic and that abductive solvers (e.g.
SLDNFA [8]) designed for this formalism can be seen as special purpose reasoners
on definitions for abduction and deduction (see the footnote). A problem of this
work is that it is based on completion semantics; completion is not a good
formalisation of induction. Extending this study to the semantics defined in
this paper is future work.

The knowledge theoretic interpretation of LP as inductive definitions also gives
insight into the relationship with a class of logics outside the area of NMR:
definition logics. This class includes fixpoint logics and description logics.
In [25], Van Belleghem et al. showed a strong correspondence between OLP-FOL and
description logics. To a large extent, description logic can be considered as a
non-recursive subformalism of OLP-FOL. There is a correspondence on the
intuitive and semantical level; the differences on the syntactic level are
syntactic sugar. The specific syntactic restrictions of description logics have
allowed the development of highly efficient reasoning techniques.

Footnote: Note that LP and OLP as definition logics do not provide default
negation; as I argued in [6], OLP is not a natural formalism to express some
sorts of default reasoning problems such as the well-known train crossing
example [13][17]. In order to represent this sort of domain, an autoepistemic
modal operator or a default negation operator should be added to definition
logic.

Also a subject for future work is to substantiate the claim in the introduction
that a broad class of human knowledge in many areas of human expertise, ranging
from common sense knowledge situations to mathematics, is of a constructive
nature, in the sense that (part of) the knowledge is present in the form of a
recursive recipe, to be interpreted as defined in this paper. The prominent
roles of completion and circumscriptive techniques in NMR and knowledge
representation hint at this.

Abstract. We propose a logic-based language for programming agents
that can reason about their own beliefs as well as the beliefs of other 
agents and can communicate with each other. The agents can be reactive, 
rational/deliberative or hybrid, combining both reactive and rational 
behaviour. We illustrate the language by means of examples. 



1 Introduction 

Kowalski & Sadri [12] propose an approach to agents within an extended logic 
programming framework. In the remainder of the paper we will refer to their 
agents as KS-agents. KS-agents are hybrid in that they exhibit both rational (or 
deliberative) and reactive behaviour. The reasoning core of KS-agents is a proof 
procedure that combines forward and backward reasoning. Backward reasoning 
is used primarily for planning, problem solving and other deliberative activities. 
Forward reasoning is used primarily for reactivity to the environment, possibly 
including other agents. The proof procedure is executed within an observe-think- 
act cycle that allows the agent to be alert to the environment and react to it 
as well as think and devise plans. Both the proof procedure and the KS-agent 
architecture can deal with temporal information. The proof procedure (IFF proof 
procedure [9]) treats both inputs from the environment and agents’ actions as 
abducibles (hypotheses). 

Barklund et al. [1] and Costantini et al. [6] present an extension to the
Reflective Prolog programming paradigm [7,8] to model agents that are
introspective and that communicate with each other. We will refer to this
approach as Reflective Prolog with Communication (RPC). In RPC introspection is
achieved via the meta-predicate solve and communication is achieved via the
meta-predicates tell and told. A communication act is triggered every time an
agent $agent_1$ has a goal of the form $agent_1 : told(agent_2, A)$. This stands
both for “$agent_1$ is told by $agent_2$ that $A$” and for “$agent_1$ asks
$agent_2$ whether $A$”. This goal is solved by $agent_2$ telling $agent_1$ that
$A$, that is, $agent_2 : tell(agent_1, A)$, standing for “$agent_2$ tells
$agent_1$ that $A$”. The information is passed from $agent_2$ to $agent_1$ by
eventually instantiating $A$. A main limitation of this approach is that agents
cannot tell anything to other agents unless explicitly asked.

The ability to provide agents with some sort of “proactive” communication
primitive is widely documented in the literature [18,19,20,5,14,17]. For example, one
can model agents that advertise their services so that other agents, possibly with 
the help of mediators, can find agents that provide services for them. 

An interesting application of agents is that of a virtual marketplace on the 
Web where users create autonomous agents to buy and sell goods on their behalf. 
Chavez & Maes [3], for example, propose a marketplace, where users can create 
selling and buying agents by giving them a description of the item they want to 
sell or buy. The main goal of the approach is to help users in the negotiations 
between buyers and sellers, and to sell the goods better (i.e., at a higher price) 
than the user would be able to otherwise, by taking advantage of their processing 
speed and communication bandwidth. Chavez & Maes’ agents are (i) proactive:
“... they try to sell themselves, by going into a marketplace, contacting
interested parties (namely, buying agents) and negotiating with them to find the
best deal”, and (ii) autonomous: “... once released into the marketplace, they
negotiate and make decisions on their own, without requiring user intervention”.
Chavez & Maes point out their agents’ lack of rationality: “Our experiment
demonstrated the need and desire for ‘smarter’ agents whose decision making
processes more closely mimic those of people and which can be directed at a more
abstract, motivational level.”

In this paper we propose a combination of (a version of) the RPC program- 
ming paradigm and KS-agents. In the resulting framework reactive, rational or 
hybrid agents can reason about their own beliefs as well as the beliefs of other 
agents and can communicate proactively with each other. In such a framework, 
the agents’ behaviour can be regulated by condition-action rules such as: if I am 
asked by another, friendly agent about something and I can prove it from my 
beliefs then I will tell the agent about it. 

In the proposed approach, the two primitives for communication in RPC, tell 
and told, are treated as abducibles within the cycle of the KS-agent architecture. 

The remainder of the paper is structured as follows. In section 2 we give 
some basic definitions (in particular, we define ordinary and abductive logic 
programs) and we review the IFF proof procedure. In section 3 we review the 
KS-agent architecture. In section 4 we review RPC. In sections 5 and 6 we 
define our approach to introspection and communication in agents, respectively. 
In section 7 we illustrate the framework by means of an example. In section 8 
we conclude and discuss future work. 

2 Preliminaries 

2.1 Basic definitions 

A logic program is a set of clauses of the form: 

$A \leftarrow L_1 \wedge \ldots \wedge L_n \quad (n \geq 0)$

where every $L_i$ ($1 \leq i \leq n$) is a literal, $A$ is an atom and all
variables are implicitly universally quantified, with scope the entire clause.
If $A = p(t)$, with $t$ a vector of terms, then the clause is said to define
$p$.

The completion of a predicate $p$ [4] defined in a given logic program $P$
by the set of clauses:

$p(t_1) \leftarrow D_1 \quad \ldots \quad p(t_k) \leftarrow D_k \quad (k \geq 1)$

is the iff-definition:

$p(x) \leftrightarrow [x = t_1 \wedge D_1] \vee \ldots \vee [x = t_k \wedge D_k]$

where $x$ is a vector of variables, all implicitly universally quantified, with
scope the entire iff-definition. Any variable in a disjunct $D_i$ which is not
in $x$ is implicitly existentially quantified, with scope the disjunct.

If $p$ is not defined in $P$, then the completion of $p$ is the iff-definition
$p(x) \leftrightarrow false$.

The selective completion $comp_S(P)$ of a logic program $P$ with respect to
a set $S$ of predicates of the language of $P$ is the union of the completions
of all predicates in $S$. Note that the completion of a logic program $P$ [4] is
the selective completion of $P$ with respect to all predicates of the language
of $P$, together with Clark’s equality theory [4].
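The construction of a selective completion can be sketched in Python for the
propositional case. This is a simplified illustration only: with no arguments
there are no equalities $x = t_i$ to generate, the output is a plain string per
predicate, and the function name and the sample program are assumptions made for
the example.

```python
from typing import Dict, List, Tuple

Clause = Tuple[str, List[str]]  # propositional clause: head <- conjunction of literals


def selective_completion(program: List[Clause], predicates: List[str]) -> Dict[str, str]:
    """Return one iff-definition (as text) for every predicate in `predicates`."""
    completed = {}
    for p in predicates:
        bodies = [body for head, body in program if head == p]
        if not bodies:
            # p has no clauses in the program: its completion is p <-> false
            completed[p] = f"{p} <-> false"
        else:
            # one disjunct per clause; an empty body (a fact) is written as "true"
            disjuncts = [" & ".join(body) if body else "true" for body in bodies]
            completed[p] = f"{p} <-> " + " v ".join(f"[{d}]" for d in disjuncts)
    return completed


program = [("has", ["buys"]), ("has", ["steals"]), ("honest", [])]
completion = selective_completion(program, ["has", "honest", "buys"])
```

Here `completion["has"]` is the text `has <-> [buys] v [steals]`, and `buys`,
which has no clauses, is completed to `buys <-> false`. Leaving a predicate out
of the second argument leaves it open, which is what the selective completion
allows.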

An integrity constraint is an implication of the form: 

$L_1 \wedge \ldots \wedge L_n \rightarrow A \quad (n > 0)$

where $L_1, \ldots, L_n$ are literals and $A$ is an atom, possibly $false$. All
variables in an integrity constraint are implicitly universally quantified, with
scope the entire integrity constraint.

An abductive logic program [10] is a triple $\langle P, A, I \rangle$, where $P$
is a logic program, $A$ a set of predicates in the language of $P$, and $I$ a
set of integrity constraints. The predicates in $A$ are referred to as the
abducible predicates and the atoms built from the abducible predicates are
referred to as abducibles. $\overline{A}$ is the set of non-abducible predicates
in the language of $P$.

Without loss of generality (see [10]), we can assume that abducible predicates 
have no definitions in P. Abducibles can be thought of as hypotheses that can 
be used to extend the given logic program in order to provide an “explanation” 
for given queries (or observations). Explanations are required to “satisfy” the 
integrity constraints. Different notions of explanation and satisfaction have been 
used in the literature. The simplest notion of satisfaction is consistency of the 
explanation with the program and the integrity constraints. 

In the sequel, constant, function and predicate symbols may be written as 
any sequence of characters in typewriter style. Variables may be written as any 
sequence of characters in italic style starting with a lower-case character. Thus, 
for example, buys may be a predicate symbol, Tom may be a constant symbol, 
while x and city are variables. 

Example 1. Let the abductive logic program $\langle P, A, I \rangle$ be given as
follows:

$P = \{\; has(x, y) \leftarrow buys(x, y),$
$\qquad has(x, y) \leftarrow steals(x, y),$
$\qquad honest(Tom) \;\}$

$A = \{\, buys, steals \,\}$

$I = \{\, honest(x) \wedge steals(x, y) \rightarrow false \,\}$.

Then, given the observation $G = has(Tom, computer)$, the set of abducibles
$\{buys(Tom, computer)\}$ is an explanation for $G$, satisfying $I$, whereas the
set $\{steals(Tom, computer)\}$ is not, because it is inconsistent with $I$.
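The check carried out informally in Example 1 can be reproduced with a small
Python sketch. It hard-codes the program and the integrity constraint of the
example in ground form and uses consistency as the notion of satisfaction; the
function names are hypothetical and the code is not a general abductive
procedure.

```python
from typing import Set, Tuple

Atom = Tuple[str, ...]  # ground atom, e.g. ("buys", "Tom", "computer")


def consequences(abducibles: Set[Atom]) -> Set[Atom]:
    """Ground consequences of P from Example 1 extended with the given abducibles."""
    facts = set(abducibles) | {("honest", "Tom")}
    for pred, *args in list(facts):
        # has(x, y) <- buys(x, y)   and   has(x, y) <- steals(x, y)
        if pred in ("buys", "steals"):
            facts.add(("has", *args))
    return facts


def violates_constraint(facts: Set[Atom]) -> bool:
    """honest(x) and steals(x, y) -> false."""
    return any(
        atom[0] == "steals" and ("honest", atom[1]) in facts
        for atom in facts
    )


def is_explanation(abducibles: Set[Atom], observation: Atom) -> bool:
    """True if the abducibles entail the observation and are consistent with I."""
    facts = consequences(abducibles)
    return observation in facts and not violates_constraint(facts)


goal = ("has", "Tom", "computer")
by_buying = is_explanation({("buys", "Tom", "computer")}, goal)
by_stealing = is_explanation({("steals", "Tom", "computer")}, goal)
```

`by_buying` is `True` and `by_stealing` is `False`, matching the two candidate
explanations discussed in the example.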



2.2 The IFF proof procedure 

The IFF proof procedure [9] is a rewriting procedure, consisting of a number of
inference rules, each of which replaces a formula by one which is equivalent to
it in a theory $T$ of iff-definitions. We assume $T = comp_{\overline{A}}(P)$,
for some given abductive logic program $\langle P, A, I \rangle$. The basic
inference rules are (see the footnote):

1. Unfolding: given an atom $p(t)$ and a definition

$p(x) \leftrightarrow D_1 \vee \ldots \vee D_n$ in $T$,

$p(t)$ is replaced by $(D_1 \vee \ldots \vee D_n)\theta$, where $\theta$ is the
substitution $\{x/t\}$.

2. Propagation: given an atom $p(s)$ and an integrity constraint

$L_1 \wedge \ldots \wedge p(t) \wedge \ldots \wedge L_n \rightarrow A$,

a new integrity constraint
$L_1 \wedge \ldots \wedge t = s \wedge \ldots \wedge L_n \rightarrow A$ is added.

3. Logical simplification:

$[B \vee C] \wedge E$ is replaced by $[B \wedge E] \vee [C \wedge E]$ (splitting)

$not\ A \wedge B \rightarrow C$ is replaced by $B \rightarrow C \vee A$ (negation elimination)

$B \wedge false$ is replaced by $false$, $B \vee false$ is replaced by $B$, and so on.

4. Equality rewriting: applies equality rewrite rules (see [9]) simulating the
unification algorithm of [16] and the application of substitutions.

Given an initial goal $G$ (a conjunction of literals, whose variables are free),
a derivation for $G$ is a sequence of formulae $F_1 = G \wedge I, \ldots, F_m$
such that each derived goal $F_{i+1}$ is obtained from $F_i$ by applying one of
the inference rules, as follows: unfolding, applied to atoms that are either
conjuncts in $F_i$ or conjuncts in bodies of integrity constraints in $F_i$;
propagation, applied to atoms that are conjuncts in $F_i$ and integrity
constraints in $F_i$; equality rewriting and logical simplification. Every
negative literal $not\ A$ occurring as a conjunct in the initial goal or in any
derived goal is rewritten as an integrity constraint $A \rightarrow false$.

Every derivation relies upon some control strategy. Some strategies are prefer- 
able to others. E.g., splitting should always be postponed as long as possible, 
because it is an explosive operation. 

Let $F_1 = G \wedge I, \ldots, F_n = N \vee Rest$ be a derivation for $G$ such
that $N \neq false$, $N$ is some conjunction of literals and integrity
constraints, and no inference step can be applied to $N$ which has not already
been applied earlier in the derivation. Then, $F_1, \ldots, F_n$ is a successful
derivation. An answer extracted from $N$ is a pair $(\mathcal{D}, \sigma)$ such
that

– $\sigma'$ is a substitution replacing all free and existentially quantified
variables in $N$ by variable-free terms in the underlying language, and
$\sigma'$ satisfies all equalities and disequalities in $N$,

– $\mathcal{D}$ is the set of all abducible atoms that are conjuncts in
$N\sigma'$, and $\sigma$ is the restriction of $\sigma'$ to the variables in $G$.

Footnote: The full IFF proof procedure includes two additional inference rules,
case analysis and factoring. In this paper, we omit these rules for simplicity.

The full proof procedure is proven sound and complete with respect to a given
semantics [9], based on Kunen’s three-valued completion semantics.

Example 2. Let $\langle P, A, I \rangle$ be the abductive logic program in
example 1. Then, $comp_{\overline{A}}(P)$ is

$\{\; has(x, y) \leftrightarrow buys(x, y) \vee steals(x, y),$
$\qquad honest(x) \leftrightarrow x = Tom \;\}$.

The following is a (successful) derivation for the goal $G = has(Tom, computer)$:

$F_1 = G \wedge I$

$F_2 = G \wedge [x = Tom \wedge steals(x, y) \rightarrow false]$ (by unfolding)

$F_3 = G \wedge [steals(Tom, y) \rightarrow false]$ (by equality rewriting)

$F_4 = [buys(Tom, computer) \vee steals(Tom, computer)] \wedge [steals(Tom, y) \rightarrow false]$ (by unfolding)

$F_5 = [buys(Tom, computer) \wedge [steals(Tom, y) \rightarrow false]] \vee [steals(Tom, computer) \wedge [steals(Tom, y) \rightarrow false]]$ (by splitting)

The answer $(\mathcal{D} = \{buys(Tom, computer)\}, \sigma = \{\})$ can be
extracted from the first disjunct of $F_5$. An additional successful derivation
is $F_1, \ldots, F_9$ with

$F_6 = [buys(Tom, computer) \wedge [steals(Tom, y) \rightarrow false]] \vee [steals(Tom, computer) \wedge [steals(Tom, y) \rightarrow false] \wedge [y = computer \rightarrow false]]$ (by propagation)

$F_7 = [buys(Tom, computer) \wedge [steals(Tom, y) \rightarrow false]] \vee [steals(Tom, computer) \wedge [steals(Tom, y) \rightarrow false] \wedge false]$ (by equality rewriting)

$F_8 = [buys(Tom, computer) \wedge [steals(Tom, y) \rightarrow false]] \vee false$ (by logical simplification)

$F_9 = [buys(Tom, computer) \wedge [steals(Tom, y) \rightarrow false]]$ (by logical simplification)

from which the same answer $(\mathcal{D}, \sigma)$ as above can be extracted.

Note that in this example, unfolding and equality rewriting in the conditions of 
the integrity constraint are performed before any other operation. In general, 
many of the operations involving the integrity constraints can be done at com- 
pile time, and the simplified (more efficient) version of the integrity constraint 
conjoined to the initial goal. 
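The interplay of unfolding, splitting, propagation and simplification can be
illustrated on ground atoms with the following Python sketch. It is a strong
simplification and not the IFF procedure itself: there are no variables (hence
no equality rewriting), definitions are assumed to be non-recursive, and
integrity constraints are assumed to be denials over abducibles only, i.e.
already simplified at compile time. The function name and the data layout are
assumptions.

```python
from typing import Dict, FrozenSet, List, Set


def ground_iff(goal: List[str],
               definitions: Dict[str, List[List[str]]],
               denials: List[List[str]]) -> List[Set[str]]:
    """Return the sets of abducibles that survive as disjuncts of the rewritten goal.

    definitions maps a defined atom to its disjuncts (each a list of atoms);
    denials lists bodies B of constraints of the form "B -> false".
    """
    disjuncts: List[FrozenSet[str]] = [frozenset(goal)]
    answers: List[Set[str]] = []
    while disjuncts:
        d = disjuncts.pop()
        defined = next((a for a in sorted(d) if a in definitions), None)
        if defined is not None:
            # unfolding followed by splitting: one new disjunct per alternative
            rest = d - {defined}
            for body in definitions[defined]:
                disjuncts.append(rest | frozenset(body))
            continue
        if any(set(body) <= d for body in denials):
            # propagation yields false; simplification removes the disjunct
            continue
        answers.append(set(d))
    return answers


answers = ground_iff(
    goal=["has(Tom,computer)"],
    definitions={"has(Tom,computer)": [["buys(Tom,computer)"], ["steals(Tom,computer)"]]},
    denials=[["steals(Tom,computer)"]],
)
```

For the ground instance of Example 2, the only surviving disjunct is the one
containing `buys(Tom,computer)`.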

3 Kowalski-Sadri agents 

Every KS-agent can be thought of as an abductive logic program, equipped with 
an initial goal. The abducibles are actions to be executed as well as observa- 
tions to be performed. Updates, observations, requests and queries are treated 
uniformly as goals. The abductive logic program can be a temporal theory. For 
example, the event calculus [13] can be written as an abductive logic program:

$holds\_at(p, t_2) \leftarrow happens(e, t_1) \wedge (t_1 < t_2) \wedge initiates(e, p) \wedge not\ broken(t_1, p, t_2)$

$broken(t_1, p, t_2) \leftarrow happens(e, t) \wedge terminates(e, p) \wedge (t_1 < t < t_2)$.

The first clause expresses that a property $p$ holds at some time $t_2$ if it is
initiated by an event $e$ at some earlier time $t_1$ and is not broken (i.e.
persists) from $t_1$ to $t_2$. The second clause expresses that a property $p$
is broken (i.e. does not persist) from a time $t_1$ to a later time $t_2$ if an
event $e$ that terminates $p$ happens at a time $t$ between $t_1$ and $t_2$.

The predicate happens is abducible, and can be used to represent both ob- 
servations, as events that have taken place in the past, or events scheduled to 
take place in the future. An integrity constraint 

$I_1$: $happens(e, t) \wedge preconditions(e, t, p) \wedge not\ holds\_at(p, t) \rightarrow false$

expresses that an event e cannot happen at a time t if the preconditions p of e 
do not hold at time t. 

The predicates preconditions, initiates and terminates have application- 
specific definitions, e.g. 

$preconditions(carry\_umbrella, t, p) \leftarrow p = own\_umbrella$
$preconditions(carry\_umbrella, t, p) \leftarrow p = borrowed\_umbrella$

$initiates(rain, raining)$

$terminates(sun, raining)$.
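Read over a fixed, finite set of recorded events, the two event calculus clauses
given earlier can be evaluated directly. The Python sketch below does this for
the rain/sun definitions above; it treats happens as given data rather than as
an abducible, and the event times are made up for illustration.

```python
from typing import Set, Tuple

Event = Tuple[str, int]        # (event name, time), read as happens(e, t)
Effect = Tuple[str, str]       # (event name, property)


def broken(t1: int, prop: str, t2: int,
           happens: Set[Event], terminates: Set[Effect]) -> bool:
    """broken(t1, p, t2): some event terminating p happens strictly between t1 and t2."""
    return any(t1 < t < t2 and (e, prop) in terminates for e, t in happens)


def holds_at(prop: str, t2: int, happens: Set[Event],
             initiates: Set[Effect], terminates: Set[Effect]) -> bool:
    """holds_at(p, t2): p was initiated at an earlier time t1 and is not broken since."""
    return any(
        t1 < t2
        and (e, prop) in initiates
        and not broken(t1, prop, t2, happens, terminates)
        for e, t1 in happens
    )


happens = {("rain", 1), ("sun", 5)}      # illustrative event times
initiates = {("rain", "raining")}
terminates = {("sun", "raining")}
```

With these events, `raining` holds at time 3 but not at time 7, since the `sun`
event at time 5 breaks it.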

Additional integrity constraints might be given to represent reactive behaviour 
of intelligent agents, e.g. 

$I_2$: $happens(raining, t) \rightarrow happens(carry\_umbrella, t + 1)$

or to prevent concurrent execution of actions (events)

$happens(e_1, t) \wedge happens(e_2, t) \rightarrow e_1 = e_2$.

The basic “engine” of a KS-agent is the IFF proof procedure, executed via the
following cycle:

To cycle at time $t$,

(i) observe any input at time $t$,

(ii) record any such input,

(iii) resume the IFF procedure by propagating the inputs,

(iv) continue applying the IFF procedure,
using for steps (iii) and (iv) a total of $r$ units of time,

(v) select an atomic action which can be executed at time $t + r + 2$,

(vi) execute the selected action at time $t + r + 2$ and record the result,

(vii) cycle at time $t + r + 3$.



The cycle starts at time $t$ by observing and recording any inputs from the
environment (steps (i) and (ii)). Steps (i) and (ii) are assumed to take one
unit of time each. Then, the proof procedure is applied for $r$ units of time
(steps (iii) and (iv)). The amount of resources $r$ available in steps (iii) and
(iv) is bounded by some predefined amount $n$. By decreasing $n$ the agent
becomes more reactive; by increasing $n$ the agent becomes more rational.
Propagation is applied first (step (iii)), in order to allow for an appropriate
reaction to the inputs. Afterwards, an action is selected and executed, taking
care of recording the result (steps (v) and (vi)). Steps (v) and (vi) together
are assumed to take one unit of time. Selected actions can be thought of as
outputs into the environment, and observations as inputs from the environment.
From every agent’s viewpoint, the environment contains all other agents.
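The timing of the cycle can be made concrete with a short Python sketch. It
only models the bookkeeping of time: the IFF procedure is replaced by an
arbitrary `think` callback, action selection simply takes the first candidate,
and all names are assumptions made for this illustration.

```python
from typing import Callable, List, Tuple


def run_cycles(start: int, r: int, n_cycles: int,
               observe: Callable[[int], List[str]],
               think: Callable[[List[str], int], List[str]]) -> List[Tuple[int, str]]:
    """Simulate n_cycles iterations of the observe-think-act cycle.

    Returns the executed actions as (execution time, action) pairs.
    """
    t = start
    recorded: List[str] = []
    executed: List[Tuple[int, str]] = []
    for _ in range(n_cycles):
        inputs = observe(t)               # (i)  observe at time t
        recorded.extend(inputs)           # (ii) record the inputs (one further time unit)
        candidates = think(recorded, r)   # (iii), (iv) reason for r units of time
        if candidates:                    # (v)  select an action executable at t + r + 2
            executed.append((t + r + 2, candidates[0]))  # (vi) execute and record
        t = t + r + 3                     # (vii) next cycle starts at t + r + 3
    return executed


def observe(t: int) -> List[str]:
    return ["raining"] if t == 0 else []


def think(recorded: List[str], r: int) -> List[str]:
    return ["carry_umbrella"] if "raining" in recorded else []


trace = run_cycles(start=0, r=2, n_cycles=1, observe=observe, think=think)
```

With `r = 2`, the input observed at time 0 leads to an action executed at time
4, and a second cycle would start at time 5. A smaller `r` shortens the delay
between observation and action, which is the sense in which the agent becomes
more reactive.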

Selected actions correspond to abducibles in an answer extracted from a dis- 
junct in a derived goal in a derivation. The disjunct represents an intention, i.e. 
a (possibly partial) plan executed in stages. A sensible action selection strategy 
may select actions from the same disjunct (intention) at different iterations of 
the cycle. Failure of a selected plan is obtained via logical simplification, after 
having propagated false into the selected disjunct. 

Actions that are generated in an intention may have times associated with
them. The times may be absolute, for example $happens(ring\_bell, 3)$, or may
be within a constrained range, for example
$happens(step\_forward, t) \wedge (1 < t < 10)$. In step (v), the selected
action will either have an absolute time equal to $t + r + 2$ or a time range
compatible with an execution time at $t + r + 2$. In the latter case, recording
the result of the execution instantiates the time of the action.

Integrity constraints provide a mechanism not only for constraining explanations
and plans, for example, as in $I_1$, but also for allowing reactive,
condition-action type of behaviour, for example, as in $I_2$.

4 Reflective Prolog with communication 

4.1 Reflective Prolog 

Reflective Prolog (RP) [7,8] is a metalogic programming language that extends
the language of Horn clauses [11,15] to include higher-order-like features.

The language is that of Horn clauses except that terms are defined differently
in order to include names that are intended to represent at the meta-level the
expressions of the language itself. The alphabet of RP differs from the usual
alphabet of Horn clauses by making a distinction between variables and
metavariables and by the presence of metaconstants in RP. Metavariables can only
be substituted with names of sentences of RP, and metaconstants are intended as
names for constants, function and predicate symbols of RP. If c is a constant, a
function or a predicate symbol, then we write 'c as a convenient way to
represent the metaconstant that names c. Similarly, if 'c is a metaconstant,
then its name is ''c, and so on.

Furthermore, the alphabet of RP contains the (unary) predicate symbol 
solve. This allows us to extend at the meta-level the intended meaning of pred- 
icates (partially) defined at the object-level. Clauses defining solve will be re- 
ferred to as meta-level clauses, and clauses defining object-level predicates will 
be referred to as object-level clauses. Metavariables are written as any sequences 
of characters in italic style starting in the upper-case. 

Compound terms and atoms are represented at the meta-level as name terms.
For example, the name of the term f(a, x) is the name term 'f('a, 'x), where 'f
and 'a are the metaconstants that name the function symbol f and constant a,
respectively, and 'x stands for the name of the value of the variable x.

The intended connection between the object-level and the meta-level of RP
is obtained by means of the following (inter-level) reflection axioms:

$A \leftarrow solve('A)$ and $solve('A) \leftarrow A$, for all atoms $A$.

The first asserts that whenever an atom of the form $solve('A)$ is provable at
the meta-level, then $A$ is provable at the object-level. The second asserts the
converse: whenever $A$ is provable at the object-level, $solve('A)$ is provable
at the meta-level. These axiom schemata are not explicitly present in any given
program but rather they are simulated within the modified SLD-resolution
underlying RP.

Suppose, for example, that we want to express the fact that an object obj
satisfies all the relations in a given class. If we want to formalise that
statement at the object-level, then for every predicate q in class we have to
write the clause q(obj). Instead, we may formalise our statement at the
meta-level as:

$solve(P('obj)) \leftarrow belongs\_to(P, class)$,

where $P$ is a metavariable ranging over the names of predicate symbols.



4.2 Communication 

RPC is an extension of Refiective Prolog to accommodate communication be- 
tween agents [ 7 , 8 ]. Agents are seen as logic programs. In an agent setting, 
solve may be seen as representing a given agent’s beliefs, for example agentj^ : 
solve(M), for some atom A, stands for “agent believes A”. RPC allows for 
two communication acts, expressed via the two meta-predicates tell(A, Y) and 
told(A, Y). An atom tell('agent2,M) in the theory representing agentj^ stands 
for “agent]^ tells agent2 that A holds”. An atom told('agent2,M) in the theory 
representing agent stands for “agentj^ is told by agent2 that A holds”. The 
latter atom can also be interpreted as “agent asks agent2 whether A holds”, 
i.e. the predicate told can be used to express queries of agents to other agents. 
Communication acts are formalized in RPC by means of (inter-theory) reflection
axioms based on the predicate symbols tell and told:

[agent1 : told('agent2, 'A)] ← [agent2 : tell('agent1, 'A)]

for all atoms A, and agent names 'agent1 and 'agent2. The intuitive meaning of
each such axiom is that every time an atom of the form tell('agent1, 'A) can be
derived from a theory agent2 (which means that agent2 wants to communicate
proposition A to agent1), the atom told('agent2, 'A) is consequently derived
in the theory agent1 (which means that proposition A becomes available to
agent1). These axioms are not explicitly present in any given program but rather
they are simulated within the modified SLD-resolution underlying RPC.
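
The effect of the inter-theory reflection axioms can be illustrated with a small Python sketch. It is not RPC's modified SLD-resolution: the function name propagate_told and the tuple encoding of atoms are assumptions made only for this illustration. Each tell atom derivable in the sender's theory yields a matching told atom in the addressee's theory.

```python
def propagate_told(derived):
    """Map every derivable tell(receiver, content) of a sender to
    told(sender, content) in the receiver's theory.

    `derived` maps an agent name to the set of atoms derivable in its theory;
    atoms are tuples such as ("tell", "agent1", "rain") (hypothetical encoding).
    """
    told = {agent: set() for agent in derived}
    for sender, atoms in derived.items():
        for atom in atoms:
            if atom[0] == "tell":
                _, receiver, content = atom
                # Messages to unknown agents are simply dropped in this sketch.
                if receiver in told:
                    told[receiver].add(("told", sender, content))
    return told


theories = {
    "agent1": set(),
    "agent2": {("tell", "agent1", "rain"), ("fact", "rain")},
}
inbox = propagate_told(theories)
```

Only what agent2 explicitly addresses to agent1 shows up in agent1's inbox; the other atoms of agent2 stay private, which mirrors the intended connection between tell and told.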

These two predicates are intended to model the simplest and most neutral 
form of communication among agents, with no implication about provability (or 
truth) of what is communicated, and no commitment about how much of its 
information an agent communicates and to whom. An agent may communicate 
to another agent everything it can derive (in its associated theory), or only part 
of what it can derive, or it may even lie, that is, it communicates something it 
cannot derive. 

The intended connection between tell and told is that an agent may receive 
from another agent (by means of told) only the information the second agent 
has explicitly addressed to it (by means of tell). Thus, an agent can regulate 
its interaction with other agents by means of appropriate clauses defining the 
predicate tell. 

What use an agent makes of any information given to it by others is entirely 
up to the agent itself. Thus, the way an agent communicates with others is not 
hard-wired in the language. Rather, it is possible to define in a program different
behaviours for different agents or different behaviours of one agent in different 
situations. (Several examples of the use of agents in RPC can be found in [1,6].) 
For example, the following clauses: 

Bob : [solve(X) ← reliable(A) ∧ told(A, X)]

Bob : [solve(X) ← told('John, 'not X)]
express that an agent Bob trusts every agent that is reliable but distrusts John. 
The clause: 

Bob : [tell(A, X) ← agent(A) ∧ solve(X)]
says that the agent Bob tells the others whatever he can prove. 

RP and its extension RPC rely upon a Prolog-style control strategy and 
therefore do not allow for agents to tell other agents or ask for information 
proactively. 



5 Adding introspection to KS-agents 

In this paper, following the framework of KS-agents, agents are represented as 
abductive logic programs rather than ordinary logic programs as in RPC. The 
abductive logic program, in its completed form, is used embedded within a KS- 
agents’ cycle. 
In order to accommodate within agents beliefs about different agents, we give
solve an additional argument rather than using labels as in RPC. The atom
solve(Agent, 'A) stands for “Agent believes A”. We will assume that solve can
only take names of atoms as second argument. By convention, solve(A) will be
an abbreviation for solve(Agent, A) within the program P of Agent itself.

Instead of incorporating the reflection principle linking object- and meta-level
within the proof procedure, as in RP, we incorporate it via (a finite set of)
clauses to be added to the given (abductive) logic program. (A similar approach
is presented in [2].)

Let ⟨P, A, I⟩ be an abductive logic program and let C = H ← B be a clause
in P. Then, we define a reflection principle ℛ𝒫:

ℛ𝒫(C) = 𝒪(C)   if C is an object-level clause
ℛ𝒫(C) = ℳ(C)   if C is a meta-level clause

where

• If H = p(t), then

𝒪(C) = { solve(X) ← X = 'p('t) ∧ B } .

Note that necessarily p ∉ A, since we assume, without loss of generality, that
abducible predicates are not defined in P.

• If H = solve('p('t)) and p ∉ A, then

ℳ(C) = { p(x) ← x = t ∧ B } .

• If H = solve(X('t)), then

ℳ(C) = { p(t) ← X = 'p ∧ B | for every p ∉ A and p ≠ solve } .

• If H = solve(X), then

ℳ(C) = { p(x) ← X = 'p('x) ∧ B | for every p ∉ A and p ≠ solve } .

ℛ𝒫(P) is given by the union of ℛ𝒫(C) for all C ∈ P.

► Let ⟨P, A, I⟩ be an abductive logic program. The abducibility set of A,
written as 𝒫(A), is the set of clauses:

𝒫(A) = { solve(X) ← X = 'a('x) ∧ a(x) | for all a ∈ A } .

The abducibility set of an abductive logic program allows a meta-level
representation of provability of abducible predicates.

The intended connection among object-level, meta-level and abducible atoms
is captured by the following definition.

► Let ⟨P, A, I⟩ be an abductive logic program.

The associated program of P with respect to A, written as 𝒜(P, A), is
defined as:

𝒜(P, A) = P ∪ ℛ𝒫(P) ∪ 𝒫(A).
The associated integrity constraints of I with respect to A, written as
𝒜(I, A), are defined as:

𝒜(I, A) = I ∪ { solve('p('x)) ⇒ p(x) | for all p ∈ A }.

The meta-abductive logic program associated with ⟨P, A, I⟩ is:

⟨𝒜(P, A), A, 𝒜(I, A)⟩.

The addition of the new integrity constraints in 𝒜(I, A) allows the agent to
propagate (and thus compute the consequences of) any new information it receives
about abducible predicates in whatever (meta-level or object-level) form,
without any need to alter the original set of integrity constraints, I, or the
program P.



6 Adding communication to KS-agents 



In this paper, we interpret tell(X, Y) and told(X, Y) as abducible predicates
in meta-abductive logic programs. As for solve, we can give tell and told an
additional argument instead of introducing labels, to represent communication
between agents. For simplicity, we will abbreviate tell(Agent, X, Y) (resp. told)
within the program P of Agent itself as tell(X, Y). As with solve, we will
assume that tell and told take only names of atoms as their last argument.



Example 3. Let agent1 be represented by the abductive logic program ⟨P, A, I⟩
with:

P = { solve(X) ← told(A, X)
      desire(y) ← y = car
      good_price(p, x) ← p = 0 }

A = { tell, told, offer }

I = { desire(x) ∧ told(A, 'good_price('p, 'x)) ⇒ tell(A, 'offer('p, 'x)) } .

Namely, agent1 believes anything it is told (by any other agent), and it desires
to have a car. The third clause in P says that anything that is free is at a good
price. Moreover, if the agent desires something and it is told (by some other
agent) of a good price for it, then it makes an offer to the other agent, by telling
it. Note that a more accurate representation of the integrity constraint should
include time.

The corresponding meta-abductive logic program is

𝒜(P, A) = P ∪ {
    solve(X) ← X = 'desire('y) ∧ y = car
    solve(X) ← X = 'good_price('p, 'x) ∧ p = 0
    desire(x) ← X = 'desire('x) ∧ told(A, X)
    good_price(p, x) ← X = 'good_price('p, 'x) ∧ told(A, X)
    solve(Y) ← Y = 'tell('A, 'X) ∧ tell(A, X)
    solve(Y) ← Y = 'told('A, 'X) ∧ told(A, X)
    solve(Y) ← Y = 'offer('p, 'x) ∧ offer(p, x) }

A = { tell, told, offer }

𝒜(I, A) = I ∪ {
    solve('tell('A, 'X)) ⇒ tell(A, X)
    solve('told('A, 'X)) ⇒ told(A, X)
    solve('offer('p, 'x)) ⇒ offer(p, x) }



Given an abductive logic program ⟨P, A, I⟩, corresponding to some agent, and
the associated meta-abductive logic program ⟨𝒜(P, A), A, 𝒜(I, A)⟩, the agent's
cycle applies the IFF procedure with

T = comp(𝒜(P, A)).

The equality rewriting rule needs to be modified to take into account equality
between names.

Example 4. Let ⟨P, A, I⟩ be the abductive logic program of Example 3. Assume
that agent1 has the input observation:

told('agent2, 'good_price('50, 'car)),

meaning that agent1 has been told by agent2 of a good price (of 50) for a car.
Then, the initial goal G is:

told('agent2, 'good_price('50, 'car)).

A computed answer for G is

D = { told('agent2, 'good_price('50, 'car)), tell('agent2, 'offer('50, 'car)) }
σ = {}.
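
The way the integrity constraint of Example 3 produces the tell atom in this answer can be mimicked by a small forward-chaining sketch in Python. This is only an illustration, not the IFF proof procedure; the function name fire_offer_constraint and the tuple encoding of atoms are assumptions, and every told content is assumed to be a triple (predicate, price, item).

```python
def fire_offer_constraint(desires, told_facts):
    """Forward reading of the constraint
    desire(x) ∧ told(A, 'good_price('p, 'x)) ⇒ tell(A, 'offer('p, 'x)).

    `desires` is a set of desired items; `told_facts` is a set of pairs
    (sender, (predicate, price, item)). Returns the tell actions to abduce.
    """
    actions = set()
    for sender, (pred, price, item) in told_facts:
        if pred == "good_price" and item in desires:
            actions.add(("tell", sender, ("offer", price, item)))
    return actions


observed = {("agent2", ("good_price", 50, "car"))}
abduced = fire_offer_constraint({"car"}, observed)
```

With the observation of Example 4 the sketch abduces exactly one action, the offer of 50 for the car addressed to agent2.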

Within the KS-agent architecture, the following requirements are met: 

— KS-agents are known by their symbolic names. 

— When a KS-agent sends a message, it directs that message to a specific 
addressee. 

— When a KS-agent receives a message, it knows the sender of that message. 

— Messages may get lost. 
Both tell and told are treated as actions: every time (the cycle of) an agent
agent1 selects a communicative action, i.e., an action of the form

told('agent2, A) or tell('agent2, A),

agent1 will attempt to execute it. If the attempt is successful, the record

told('agent2, A) or tell('agent2, A)

is conjoined to the goals of agent agent1 as an additional input. If the attempt
is not successful, the record

told('agent2, A) ⇒ false or tell('agent2, A) ⇒ false

is conjoined to the goals of agent agent1 as an additional input. In the next
section we will formalise an example showing how proactive communication is
achieved by executing the proof procedure within cycle.

7 Example 

The following example demonstrates the running of the proof procedure within 
cycle, action selection, proactive communication whereby one agent volunteers 
information to another, and how during the planning phase such information 
can help in the choice of intention. 

The example is as follows: an agent wishes to register for a conference, let
us say Jelia, that is to take place in Paris on the 10th and to make travel
arrangements to go to Paris on the 10th (for simplicity we omit the month). So
the agent's original goal is the conjunction:

register(Jelia) ∧ travel(Paris, 10).

The agent has the following program, P:

C1  travel(city, date) ← go(city, date, train)

C2  travel(city, date) ← go(city, date, plane)

C3  early_registration(Jelia) ← send_form(Jelia, date) ∧ (date < 3)

C4  go(city, date, means) ← book(city, date, means)

C5  book(city, date, means) ←
        told('ticket_agent, 'available('city, 'date, 'means)) ∧
        tell('ticket_agent, 'reserve('city, 'date, 'means))

C6  solve(X) ← told('ticket_agent, X).

Clauses C1 and C2 say that one travels to a city on a given date if one goes there
on that date by train or by plane. C3 says that the deadline for early registration
for Jelia is the 3rd. C4 says that one goes to a city on a given date by some means
if one makes a booking for that journey. C5 says that one makes a booking if
the ticket_agent confirms availability of a ticket and one makes a reservation.
Note that in a more thorough representation we would represent the transaction
time of when a booking is made (which must be before the time of travel). In that
case we would have an extra argument in tell and told that represents such
transaction times. We will ignore this issue here for simplicity.

The agent has the following set of integrity constraints I:

I1  register(conference) ⇒ early_registration(conference)

I2  go(city, date, means) ∧ strike(city, date, means) ⇒ false.

I1 states the agent's departmental policy that anyone registering at a conference
must take advantage of early registration. I2 states that one cannot use a mode
of transport which is subject to a strike.

Finally, the agent has the following set of abducibles A:

{send_form, tell, told, register, available, reserve, strike}.

Now suppose that the cycle of the agent starts at time 1 and that the agent
does not observe any input. Thus, the cycle effectively starts at step (iv) by
applying the IFF proof procedure (using ⟨𝒜(P, A), A, 𝒜(I, A)⟩, which we do not
show here) to the goal obtained by conjoining the original goal and the integrity
constraints 𝒜(I, A). By repeatedly applying the IFF proof procedure, the goal
is transformed (within the initial cycle or within some later iteration, depending
on the resource parameter r) into

register(Jelia) ∧ travel(Paris, 10) ∧ send_form(Jelia, date) ∧ (date < 3).

At this point cycle can select (at step (v)) the action send_form(Jelia, date) to
perform if time is less than 3. If the agent has been too slow and time 3 has
already passed, the agent has failed its goals. Suppose the agent succeeds. Note
that at any time any other agent can send this agent a message. So suppose the
ticket agent sends a message that trains are on strike in Paris on the 10th, i.e.
at some iteration of cycle at step (i) the agent receives the following input

told('ticket_agent, 'strike('Paris, '10, 'train)).

In that iteration of cycle this information is propagated (step (iii)) and the
simplified constraint 

go(Paris, 10, train) => false 

is added to the goal. Meanwhile the sub-goal travel(Paris, 10) is unfolded (step 
(iv)) into 

go(Paris, 10, train) V go(Paris, 10, plane). 

The information about the strike will be used to remove the first possibility (i.e.
the first disjunct) leaving only 
go(Paris, 10, plane) 

which will become the agent’s intention. This will be unfolded into the plan 
told('ticket_agent, 'available('Paris, '10, 'plane)) ∧
tell('ticket_agent, 'reserve('Paris, '10, 'plane)).

The appropriate actions will be selected during some iterations of cycle. 
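
The pruning step in this example can be sketched in Python as follows. The sketch compresses propagation through clause C6 and constraint I2 into two hypothetical helper functions (propagate_strikes and prune_plans, both names are assumptions) and does not model cycle or the IFF procedure itself.

```python
def propagate_strikes(told_inputs, trusted=("ticket_agent",)):
    """Turn told(sender, 'strike(city, date, means)) inputs from trusted
    senders into denials go(city, date, means) ⇒ false, as I2 would."""
    denials = set()
    for sender, content in told_inputs:
        if sender in trusted and content[0] == "strike":
            _, city, date, means = content
            denials.add(("go", city, date, means))
    return denials


def prune_plans(disjuncts, denials):
    """Remove every disjunct that some denial rules out."""
    return [d for d in disjuncts if d not in denials]


inputs = [("ticket_agent", ("strike", "Paris", 10, "train"))]
options = [("go", "Paris", 10, "train"), ("go", "Paris", 10, "plane")]
intention = prune_plans(options, propagate_strikes(inputs))
```

After the strike message only the plane disjunct survives, which matches the intention derived in the text.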

For lack of space we have ignored the cycle of the ticket_agent. 

8 Conclusions 

We have presented an approach to agents that can reason about their own beliefs 
as well as beliefs of other agents and that can communicate with each other. The 
approach results from the combination of the approach to agents in [12] and the
approach to meta-reasoning and communication in [6,1]. We have illustrated the
approach by means of a number of examples.

The approach needs to be extended in a number of ways. For simplicity, we 
have ignored the treatment of time. However, time plays an important role in 
most agent applications and should be explicitly taken into account. 

We have considered only two communication performatives, tell and told. 
Existing communication languages, e.g. [5], consider additional performatives, 
e.g. deny, achieve and unachieve. We are currently investigating whether some 
of these additional performatives could be defined via communication protocols, 
as definitions and integrity constraints within our framework. 

The primitive told is used to express both active request for information and 
passive communication. The two roles should be separated out, possibly with the 
addition of a third predicate ask, distinguished from told. Then the predicate 
told could be defined in terms of ask and tell, rather than be an abducible. 
For example, an agent agent1 is told of X by another agent agent2 if and only
if agent2 tells agent1 of X or agent1 actively asks agent2 about X and agent2
gives a positive answer.

We have implicitly assumed that different agents share the same content lan- 
guage. However, this assumption is not essential. Indeed, “translator agents” 
could be defined, acting as mediators between agents with different content lan- 
guages. This is possible by virtue of the metalogic features of the language. 
Disjunctive Logic Program = Horn Program + Control Program

Abstract. This paper presents an alternative view on propositional
disjunctive logic programs: Disjunctive program = Control program + Horn
program. For this we introduce a program transformation which transforms
a disjunctive logic program into a Horn program and a so-called
control program. The control program consists of only disjunctions of
new propositional atoms and controls the “execution” of the Horn program.
The relationship between original and transformed programs is
established by using circumscription. Based on this relationship a new
minimal model reasoning approach is developed. Due to the transformation
it is straightforward to incorporate SLD-resolution into the proof
procedure.



1 Introduction 

Disjunctive logic programs may contain definite or indefinite information, which
reflects human limitation in understanding the world being modeled. In the
absence of negation, that is, for positive disjunctive programs, the semantics is
defined by the so-called Generalized Closed World Assumption (GCWA), which is
equivalent to minimal model reasoning [Min82]. GCWA allows one to assume an
atom to be false if it does not appear in any minimal model of the program.

There are various proof procedures proposed for disjunctive logic programming:
Minker and his co-workers used SLI-resolution [LMR92], which is in fact
a version of the model elimination procedure; Loveland proposed the
near-Horn-Prolog procedures [Lov91]; and there are variants of model elimination
to deal with disjunctive logic programs [BFS97]. Besides these goal-oriented
approaches there are also bottom-up procedures, like SATCHMO [MB88,LRW95] or
Hyper tableau [BFN96]. For a unifying view of bottom-up and goal-oriented
approaches see [BF97].

In this paper we propose a novel point of view on propositional disjunctive
logic programs, which is motivated by the following observation.

Given a disjunctive logic program 𝒫, for each clause C of the form

$$a_1 \lor \dots \lor a_n \leftarrow b_1, \dots, b_m$$

we split C into a set Horn(C) of Horn clauses,

$$Horn(C) = \{\, a_1 \leftarrow b_1, \dots, b_m,\;\; a_2 \leftarrow b_1, \dots, b_m,\;\; \dots,\;\; a_n \leftarrow b_1, \dots, b_m \,\} \tag{1}$$

Consider the following two sets:

$$Horn(\mathcal{P}) = \prod_{C \in \mathcal{P}} Horn(C) \tag{2}$$

$$\mathcal{P}_h = \bigcup_{C \in \mathcal{P}} Horn(C) \tag{3}$$

Horn(𝒫) can be regarded as a set of Horn programs, each of which contains
exactly one Horn clause from each Horn(C) (C ∈ 𝒫). It is known that
this splitting is complete wrt. minimal models of 𝒫 in the sense that for each
minimal model M of 𝒫, there is a Horn program P_M ∈ Horn(𝒫) such that M is
the least model of P_M [IKH92,Lu97]. On the other hand, 𝒫_h is a Horn program.
Notice that for any P_h ∈ Horn(𝒫), P_h ⊆ 𝒫_h¹; therefore we can say that 𝒫_h is
complete wrt. minimal models of 𝒫 in the sense that for each minimal model M of
𝒫, there is a subset P_M of 𝒫_h such that M is the least model of P_M. So under
minimal model semantics a disjunctive logic program can be regarded as a Horn
program together with a control means which controls the execution of the Horn
program (which clauses should be chosen to make a minimal model).
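
The splitting can be tried out on a toy program with the following Python sketch. The clause encoding (a pair of a head tuple and a body tuple), the function names and the example program {a ∨ b, c ← a} are assumptions chosen for illustration; they are not taken from the paper.

```python
from itertools import product


def horn_split(clause):
    """Formula (1): one Horn clause per head atom, all sharing the body."""
    heads, body = clause
    return [((h,), body) for h in heads]


def horn_programs(program):
    """Formula (2): every way of picking one Horn clause per original clause."""
    return [list(choice) for choice in product(*(horn_split(c) for c in program))]


def horn_union(program):
    """Formula (3): the single Horn program containing all split clauses."""
    return [hc for c in program for hc in horn_split(c)]


def least_model(horn_program):
    """Naive fixpoint iteration of the immediate consequence operator."""
    model, changed = set(), True
    while changed:
        changed = False
        for (head,), body in horn_program:
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model


toy = [(("a", "b"), ()), (("c",), ("a",))]   # a ∨ b.   c ← a.
least_models = [least_model(hp) for hp in horn_programs(toy)]
```

For this toy program the two members of the product have least models {a, c} and {b}, which are exactly its minimal models, whereas the least model of the union is {a, b, c}. This is why some control over which clauses are chosen is needed.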

Based on this observation, in this paper we put forward

Disjunctive Program = Control Program + Horn Program

This point of view establishes a relationship between disjunctive logic
programs and definite logic programs; the latter have a well-understood
declarative semantics and an effective procedural implementation. As a
consequence, we can expect to implement minimal model reasoning by using
existing techniques (e.g. SLD-resolution) developed for definite logic programs.

Other contributions of this paper are as follows: 

— A program transformation is introduced which demonstrates the idea
presented above. By introducing some new propositional atoms it transforms a
disjunctive logic program 𝒫 into 𝒫_C ∪ 𝒫_H, where 𝒫_C is a set of disjunctive
facts consisting of only new propositional atoms and 𝒫_H is a Horn program.
It turns out that this transformation is sound and complete wrt. minimal
models in the sense that M is a minimal model of 𝒫 iff there is a minimal
model M_c of 𝒫_C such that M ∪ M_c is a (P, Z)-minimal model of 𝒫_C ∪ 𝒫_H,
where P is the set of all atoms in 𝒫 and Z is the set of new propositional
atoms introduced by the transformation.

— Based on the above result, a novel minimal model reasoning procedure is
developed which differs from similar work [LMR92,Gin89,Prz89,Ara96] in that it
uses SLD-resolution as a basic inference mechanism.

¹ Here, for convenience, we abuse notation and treat an element in Horn(C) as a set.

The rest of the paper is organised as follows: after reviewing logic programs 
and related background knowledge, we provide the transformation in Section 3. 
Some properties of the transformation are discussed and the soundness and com- 
pleteness are proved. In Section 4, we discuss how minimal model reasoning can 
be implemented based on SLD-resolution and develop such a procedure. The 
paper is finally concluded with some remarks on further work. 



2 Preliminaries 

Given a first order language L, a disjunctive logic program 𝒫 is a finite set of
disjunctive program clauses of the form

$$a_1 \lor \dots \lor a_n \leftarrow b_1 \land \dots \land b_m$$

where every a_i and b_j is an atom, with n ≥ 1 and m ≥ 0, and all variables are
considered to be universally quantified. a_1 ∨ ... ∨ a_n is referred to as the head
and b_1 ∧ ... ∧ b_m as the body of the program clause. Usually non-monotonic
negations are allowed in the body of a clause, but in this paper we restrict
ourselves to positive programs where such negations do not occur. This is not a
severe restriction, since minimal model reasoning with positive programs is an
important problem to tackle [NNS95,Oli92,IKH92,Gin89,Prz89,Ara96]. When n = 1,
the program clause is called a Horn or definite program clause, and when
m = 0, it is called a fact. Note that the head of a program clause is not allowed
to be empty, and consequently all disjunctive logic programs are consistent. A
formula in disjunctive normal form, consisting only of ground atoms, is referred
to as a sentence. The reader is referred to, for example, [Llo87,LMR92] and
references therein, for more information on logic programming and disjunctive
logic programs.

The Herbrand base of the language L is usually denoted as HB_L. When
no ambiguity arises the subscript L is dropped. Usually, when we consider a
program 𝒫, we are interested in a Herbrand base that is restricted to predicate,
function and constant symbols that appear in 𝒫. If 𝒫 has no constant symbols
then a dummy constant is assumed. In the sequel we call such a Herbrand base
the Herbrand base of 𝒫, denote it as HB_𝒫, and simply write interpretation and
model to denote Herbrand interpretation and Herbrand model, respectively, of a
logic program.

The meaning of a disjunctive logic program is given by the set of its logical
consequences. Obviously, negative information cannot be handled efficiently in
this classical semantics, and hence the generalised closed world assumption
[Min82], referred to as GCWA for short, is usually employed to infer negative
information from a disjunctive logic program (see [LMR92] for more discussion
and details). GCWA allows one to assume an atom to be false if it doesn't appear
in any minimal model of the program. This has been independently extended to
sentences in [YH85,GPP89]. All these versions are not fundamentally different, and
for the purpose of this paper, we simply refer to the following definition of closed
world assumption in disjunctive logic programming.

In the following we always assume that the given program is instantiated,
that is, all clauses in the program are ground, and we make use of the notion of
minimal models, minimal entailment, written as 𝒫 ⊨_min ¬α, and the immediate
consequence operator T_𝒫 for Horn programs.
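
For small ground programs, GCWA can be checked by brute force. The Python sketch below enumerates all interpretations, which is exponential and only meant as an illustration; the clause encoding (head tuple, body tuple), the function names and the example program are assumptions.

```python
from itertools import combinations


def is_model(interp, program):
    """A clause holds if some head atom is true or some body atom is false."""
    return all(any(h in interp for h in heads) or not all(b in interp for b in body)
               for heads, body in program)


def minimal_models(program):
    """Enumerate all interpretations and keep the subset-minimal models."""
    atoms = sorted({a for heads, body in program for a in heads + body})
    models = [set(c) for r in range(len(atoms) + 1)
              for c in combinations(atoms, r) if is_model(set(c), program)]
    return [m for m in models if not any(n < m for n in models)]


def gcwa_false(atom, program):
    """GCWA: an atom may be assumed false if no minimal model contains it."""
    return all(atom not in m for m in minimal_models(program))


toy = [(("a", "b"), ()), (("c",), ("a", "b"))]   # a ∨ b.   c ← a, b.
```

In this toy program the minimal models are {a} and {b}, so c may be assumed false under GCWA while neither a nor b may.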

3 CH-Transformation 

In this section we first introduce a program transformation which realizes the 
idea presented in the introduction. Then we discuss the relationship between 
original and transformed programs and prove that under minimal model seman- 
tics they are equivalent. This result provides a basis for our further discussion. 

Definition 1. Let 𝒫 be a disjunctive logic program. For each clause C ∈ 𝒫 of
the form

$$a_1 \lor a_2 \lor \dots \lor a_n \leftarrow b_1, \dots, b_m$$

the CH-transformation of C, denoted by CH(C), is a set of clauses defined by

$$CH(C) = \begin{cases} \{C\} & : n = 1 \\ C_{ch} & : n > 1 \end{cases}$$

where

$$C_{ch} = \{\, C_1 \lor C_2 \lor \dots \lor C_n,\;\; a_1 \leftarrow b_1, \dots, b_m, C_1,\;\; a_2 \leftarrow b_1, \dots, b_m, C_2,\;\; \dots,\;\; a_n \leftarrow b_1, \dots, b_m, C_n \,\}$$

C_1, ..., C_n are new propositional atoms not in 𝒫; they are called control
variables, and the disjunctive fact C_1 ∨ C_2 ∨ ... ∨ C_n is called a control
clause. The CH-transformation of 𝒫, denoted by CH(𝒫), is defined by

$$CH(\mathcal{P}) = \bigcup_{C \in \mathcal{P}} CH(C)$$

The set of all control variables in CH(𝒫) is denoted by C_𝒫.

By the definition, each clause in CH(𝒫) is either a Horn clause or a disjunctive
fact (control clause). Therefore CH(𝒫) can be represented as follows:

CH(𝒫) = 𝒫_C + 𝒫_H

where 𝒫_C is the program consisting of all control clauses in CH(𝒫) and 𝒫_H
is the Horn program consisting of all Horn clauses in CH(𝒫). 𝒫_C is called the
control program of 𝒫_H. The following facts are trivial.
— The disjunctions in 𝒫_C consist only of control variables. No control variable
occurs more than once in 𝒫_C.

— Each Horn clause in 𝒫_H contains at most one control variable. No control
variable occurs twice in 𝒫_H.

— Each control variable occurs exactly twice in CH(𝒫), once in 𝒫_C and
once in 𝒫_H.
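
The transformation of Definition 1 is easy to mechanise. The following Python sketch is one possible rendering; the clause encoding (head tuple, body tuple) and the naming scheme c{k}_{i} for fresh control variables are assumptions made for illustration.

```python
def ch_transform(program):
    """Return (control_program, horn_program) for a ground disjunctive program.

    A clause is a pair (heads, body) of tuples of atom names. Clause number k
    with n > 1 head atoms receives fresh control variables c{k}_1 .. c{k}_n
    (a hypothetical naming scheme, assumed not to clash with program atoms).
    """
    control, horn = [], []
    for k, (heads, body) in enumerate(program):
        if len(heads) == 1:
            # n = 1: the clause is already Horn and is kept unchanged.
            horn.append((heads, body))
            continue
        ctrl = tuple(f"c{k}_{i}" for i in range(1, len(heads) + 1))
        control.append((ctrl, ()))                  # control clause C_1 ∨ ... ∨ C_n
        for h, c in zip(heads, ctrl):
            horn.append(((h,), body + (c,)))        # a_i ← b_1, ..., b_m, C_i
    return control, horn


toy = [(("p", "s"), ()), (("t", "u"), ("r",)), (("r",), ())]
control_program, horn_program = ch_transform(toy)
```

Each control variable appears once in the control program and once in a Horn clause body, as stated in the facts above.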

For any clause C of the form

$$a_1 \lor a_2 \lor \dots \lor a_n \leftarrow b_1, \dots, b_m$$

let Horn(C), Horn(𝒫) and 𝒫_h be given by formulae 1, 2 and 3 in the
introduction, respectively. For any clause C ∈ 𝒫_H, we denote by Ĉ the clause
obtained from C by deleting the control variable in the body of C (if any). Below
we state the relationship between Horn(𝒫) and CH(𝒫).

Definition 2. Let 𝒫 be a disjunctive logic program and C_𝒫 be the set of all
control variables. For any S ⊆ C_𝒫, the subprogram of 𝒫_H determined by S,
written as 𝒫_H(S), is defined by

$$\mathcal{P}_H(S) = \{\, \hat{C} \mid C \in \mathcal{P}_H \text{ such that } C \text{ contains either no control variable or one in } S \,\}$$

Notice that, when the control variables are ignored, 𝒫_H and 𝒫_h are the same,
and that a minimal model M_c of 𝒫_C contains one and only one control variable
from each control clause. The following proposition is then trivial.

Proposition 3. Let 𝒫 be a disjunctive logic program. Then for any
P_h ∈ Horn(𝒫), there is a minimal model M_c of 𝒫_C such that 𝒫_H(M_c) = P_h.

Because Horn(𝒫) is complete wrt. minimal models, in the sense that for any
minimal model M of 𝒫 there is a P_h ∈ Horn(𝒫) such that M is the least model
of P_h, we have

Theorem 4. Let 𝒫 be a disjunctive logic program. Then for any minimal model
M of 𝒫, there is a minimal model M_c of 𝒫_C such that M is the least model of
𝒫_H(M_c).



Example 5. Let

𝒫 = { p ← q, r
      q ← r
      r
      s ← r_1
      t ← r_1
      p ∨ s
      t ∨ u ← r }

then the CH-transformation of 𝒫 is as follows.

CH(𝒫) = 𝒫_C + 𝒫_H

where

𝒫_C = { A_1 ∨ A_2,
        B_1 ∨ B_2 }

𝒫_H = { p ← q, r
        q ← r
        r
        s ← r_1
        t ← r_1
        p ← A_1
        s ← A_2
        t ← r, B_1
        u ← r, B_2 }

and C_𝒫 = { A_1, A_2, B_1, B_2 }. M_c = { A_1, B_1 } is a minimal model of 𝒫_C
and

𝒫_H(M_c) = { p ← q, r
             q ← r
             r
             s ← r_1
             t ← r_1
             p
             t ← r }
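
The selection of a subprogram by a set of control variables can be sketched in Python as below. The Horn program used here is a hand-written toy in the style of the example (clauses p ← q, r; q ← r; r; p ∨ s; t ∨ u ← r after transformation), not a verbatim copy of it, and the function names and clause encoding (head atom, body tuple) are assumptions.

```python
def subprogram(horn, selected, control_vars):
    """P_H(S): keep clauses whose control variable (if any) is in `selected`,
    and delete that control variable from the body."""
    result = []
    for head, body in horn:
        ctrl = [b for b in body if b in control_vars]
        if all(c in selected for c in ctrl):
            result.append((head, tuple(b for b in body if b not in control_vars)))
    return result


def least_model(horn):
    """Naive fixpoint computation of the least model of a ground Horn program."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in horn:
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model


CONTROL = {"A1", "A2", "B1", "B2"}
HORN = [("p", ("q", "r")), ("q", ("r",)), ("r", ()),
        ("p", ("A1",)), ("s", ("A2",)),
        ("t", ("r", "B1")), ("u", ("r", "B2"))]
```

Choosing {A1, B1} gives the least model {p, q, r, t}, which is a minimal model of the toy program. Choosing {A2, B2} gives {p, q, r, s, u}, which is a model but not a minimal one, so not every minimal model of the control program leads to a minimal model of the original program.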

Now we discuss the relationship between program 𝒫 and CH(𝒫). To do so
we need some notation about circumscription [McC80,GPP89,Lif85].

Given a first-order theory T, let P and Z be disjoint tuples of predicates from
T; then the circumscription of P in T with variable Z is defined as the
second-order formula:

$$Circ(T; P; Z) = T(P, Z) \land \lnot \exists P' Z' \, ( T(P', Z') \land P' < P )$$

where T(P, Z) is a theory containing predicate constants P, Z, and P′, Z′ are
tuples of predicate variables similar to P, Z. The set of all predicates other than
P, Z from T is denoted by Q, which is called the fixed predicates.

Definition 6. For any two models M and N of T we write M ≤ N mod(P, Z)
if models M and N differ only in how they interpret predicates from P and Z and
if the extension of every predicate from P in M is a subset of its extension in
N. A model M of T is called (P, Z)-minimal if there is no model N of T such
that N < M mod(P, Z) (i.e. such that N ≤ M mod(P, Z) but not M ≤ N
mod(P, Z)).

Note: When T is a propositional theory, the definition of the relation ≤ can
be restated as follows: For any two models M and N of T we write M ≤ N
mod(P, Z) if models M and N differ only in how they interpret atoms from P
and Z and M ∩ P ⊆ N ∩ P.
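
In the propositional case the relation and the resulting notion of minimality can be checked directly. The Python sketch below does this by brute force; the function names, the clause encoding (head tuple, body tuple) and the one-clause example theory are assumptions for illustration.

```python
from itertools import combinations


def all_interpretations(atoms):
    atoms = sorted(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]


def satisfies(interp, clauses):
    return all(any(h in interp for h in heads) or not all(b in interp for b in body)
               for heads, body in clauses)


def leq(m, n, P, Z):
    """M ≤ N mod(P, Z): same fixed atoms, and M ∩ P ⊆ N ∩ P."""
    return (m - P - Z) == (n - P - Z) and (m & P) <= (n & P)


def pz_minimal(models, P, Z):
    """Models M with no model N such that N ≤ M but not M ≤ N."""
    return [m for m in models
            if not any(leq(n, m, P, Z) and not leq(m, n, P, Z) for n in models)]


theory = [(("p", "z"), ())]                        # the single clause p ∨ z
models = [i for i in all_interpretations({"p", "z"}) if satisfies(i, theory)]
```

With P = {p} minimised and Z = {z} allowed to vary, only {z} is (P, Z)-minimal. If z is fixed instead (Z empty), models that disagree on z are incomparable, and both {p} and {z} are minimal.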

The following theorem is due to [McC80,GPP89,Lif85], which explains the
semantics of circumscription.

Theorem 7. M is a model of Circ(T; P; Z) iff M is a (P, Z)-minimal model
of T. In other words, for any formula F, we have Circ(T; P; Z) ⊨ F iff M ⊨ F
for every (P, Z)-minimal model M of T.

Let 𝒫 be a disjunctive logic program and CH(𝒫) = 𝒫_C ∪ 𝒫_H be the
CH-transformation of 𝒫. In the following we always let P = HB_𝒫 and for any
model M of CH(𝒫), denote M_c = M ∩ C_𝒫 and M_h = M ∩ P and write
M = M_c ∪ M_h.

Lemma 8. Let 𝒫 be a disjunctive logic program and CH(𝒫) = 𝒫_C ∪ 𝒫_H. If
M = M_c ∪ M_h is a model of CH(𝒫), then M_h is a model of 𝒫.

Proof: Let M = M_c ∪ M_h be a model of CH(𝒫). If M_h is not a model of 𝒫,
then there is a clause C of the form

$$a_1 \lor a_2 \lor \dots \lor a_n \leftarrow b_1, \dots, b_m$$

such that b_1, ..., b_m are true in M_h and for any 1 ≤ i ≤ n, a_i ∉ M_h. It must
be that n > 1, otherwise C ∈ CH(𝒫), which contradicts that M is a model of
CH(𝒫). Let C_1 ∨ C_2 ∨ ... ∨ C_n be the control clause in CH(C); because M is
a model of CH(𝒫), there is i (1 ≤ i ≤ n) such that C_i ∈ M. Then b_1, ..., b_m, C_i
must be true in M, therefore a_i ∈ M. But a_i occurs in 𝒫, so a_i ∈ M_h. This is
a contradiction and we conclude that M_h is a model of 𝒫.

Theorem 9. Let 𝒫 be a disjunctive logic program and CH(𝒫) = 𝒫_C ∪ 𝒫_H.
Then M is a minimal model of 𝒫 iff there exists a minimal model M_c of 𝒫_C
such that M_c ∪ M is a (P, C_𝒫)-minimal model of CH(𝒫).

Proof: Let M be a minimal model of 𝒫; we prove that there is an M_c such that
M_c ∪ M is a (P, C_𝒫)-minimal model of CH(𝒫). For the minimal model M,
define M_c as follows: for each clause C of the form

$$a_1 \lor a_2 \lor \dots \lor a_n \leftarrow b_1, \dots, b_m$$

in 𝒫, where n > 1, let C_1 ∨ C_2 ∨ ... ∨ C_n be the control clause in CH(C). If
b_1, ..., b_m is true in M, then there exists i such that a_i ∈ M; let C_i ∈ M_c.
Otherwise let C_1 ∈ M_c. Nothing else is in M_c. It is clear that M_c is a minimal
model of 𝒫_C.

M_c ∪ M is a model of CH(𝒫). If not so, then there is a clause
a_i ← b_1, ..., b_m, C_i such that b_1, ..., b_m, C_i is true in M_c ∪ M but a_i is
not in M. By the construction of M_c this is impossible, because in this case
C_i ∈ M_c if and only if a_i ∈ M.

M_c ∪ M is a (P, C_𝒫)-minimal model of CH(𝒫). If not so, then there is a
minimal model M′_c ∪ M′ of CH(𝒫) such that M′ ⊂ M. By Lemma 8, M′ is
a model of 𝒫. This contradicts that M is a minimal model of 𝒫.

To prove the other direction of the theorem, let M_c be a minimal model of
𝒫_C such that M_c ∪ M is a (P, C_𝒫)-minimal model of CH(𝒫); then by Lemma 8,
M is a model of 𝒫. If M is not a minimal model, then there is a minimal model
M′ of 𝒫 such that M′ ⊂ M. For this M′, we can define M′_c as above such that
M′_c ∪ M′ is a (P, C_𝒫)-minimal model of CH(𝒫) but M′ ⊂ M. This contradicts
that M_c ∪ M is a (P, C_𝒫)-minimal model of CH(𝒫).

By Theorem 7 and Theorem 9 we finally conclude

Theorem 10. Let 𝒫 be a disjunctive logic program and CH(𝒫) = 𝒫_C ∪ 𝒫_H.
Then for any formula α which contains no new propositional atoms, we have
𝒫 ⊨_min α iff Circ(CH(𝒫); P; C_𝒫) ⊨ α.
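
The correspondence between minimal models of the original program and (P, C_𝒫)-minimal models of its transformation can be confirmed by brute force on a small example. The Python sketch below is only a sanity check under assumed encodings (clauses as head/body tuples, control variables named c{k}_{i}); the toy program {p ∨ s, t ∨ u ← p} is hypothetical.

```python
from itertools import combinations


def subsets(atoms):
    atoms = sorted(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]


def is_model(interp, program):
    return all(any(h in interp for h in heads) or not all(b in interp for b in body)
               for heads, body in program)


def minimal_models(program, atoms):
    models = [m for m in subsets(atoms) if is_model(m, program)]
    return {m for m in models if not any(n < m for n in models)}


def ch_transform(program):
    out, ctrl_vars = [], set()
    for k, (heads, body) in enumerate(program):
        if len(heads) == 1:
            out.append((heads, body))
            continue
        ctrl = tuple(f"c{k}_{i}" for i in range(1, len(heads) + 1))
        ctrl_vars.update(ctrl)
        out.append((ctrl, ()))
        out.extend(((h,), body + (c,)) for h, c in zip(heads, ctrl))
    return out, ctrl_vars


def minimal_via_ch(program):
    """Atom parts of the (P, C_P)-minimal models of CH(program) whose control
    part is a minimal model of the control program."""
    atoms = {a for heads, body in program for a in heads + body}
    ch, ctrl = ch_transform(program)
    control_clauses = [c for c in ch if set(c[0]) <= ctrl]
    min_ctrl = minimal_models(control_clauses, ctrl)
    models = [m for m in subsets(atoms | ctrl) if is_model(m, ch)]
    # There are no fixed atoms here, so N < M reduces to N ∩ P ⊂ M ∩ P.
    return {m & atoms for m in models
            if (m & ctrl) in min_ctrl
            and not any((n & atoms) < (m & atoms) for n in models)}


toy = [(("p", "s"), ()), (("t", "u"), ("p",))]     # p ∨ s.   t ∨ u ← p.
toy_atoms = {"p", "s", "t", "u"}
```

For this program both computations return the same three sets, {s}, {p, t} and {p, u}.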

4 Query Answering under Minimal Model Semantics 

In this section we discuss how to answer queries under minimal model
semantics. Because positive queries can be answered by any existing theorem
prover, for example PROTEIN [BF94], we concentrate only on negative queries,
that is, for a given atom q, on evaluating whether ¬q is true in all minimal
models. The algorithm presented below can be combined with an existing theorem
prover to get a sound and complete minimal model reasoning procedure. An example
of such an approach can be found in [Ara96].

Our method is based on SLD-resolution due to the CH-transformation. An
SLD-resolution proof procedure provides a way to compute answers for a goal
from a definite logic program. It starts with a goal clause, ← a_1, ..., a_n, and
provides a refutation by deriving an empty goal clause □. The SLD derivation
for definite propositional logic programs can be stated as follows.

Definition 11 (SLD derivation). Let 𝒫 be a definite logic program and G be
the goal ← a_1, ..., a_m, ..., a_n. An SLD derivation from 𝒫 with top goal G
consists of a (finite or infinite) sequence of goals G_0 (= G), G_1, ..., such that
for all i ≥ 0, G_{i+1} is obtained from G_i as follows:

1. a_m is an atom in G_i and is called the selected atom.

2. a_m ← b_1, ..., b_q is a program clause in 𝒫.

3. G_{i+1} is the goal ← a_1, ..., a_{m-1}, b_1, ..., b_q, a_{m+1}, ..., a_n.

SLD-derivations can be finite or infinite. An SLD refutation from 𝒫 with top
goal G is a finite SLD derivation of the empty clause □ from 𝒫. A finite
SLD-derivation can be successful or failed. A successful derivation is one that
ends in the empty clause, that is, it is a refutation. A failed SLD-derivation is
one that ends in a non-empty goal with the property that no atom in it occurs in
the head of clauses in 𝒫. SLD resolution is the system that uses SLD derivation
as an inference mechanism.

Definition 12. Let 𝒫 be a definite program, G a goal and R a computation
rule. Then the SLD-tree for 𝒫 ∪ {G} via R is defined as follows:

1. Each node of the tree is a goal (possibly empty).

2. The root node is G.

3. Let ← a_1, ..., a_m, ..., a_k (k ≥ 1) be a node in the tree and suppose that
a_m is the atom selected by R. Then this node has a descendant for each input
clause a_m ← b_1, ..., b_q. The descendant is

← a_1, ..., b_1, ..., b_q, ..., a_k

4. Nodes which are the empty clause have no descendants.

Each branch of the SLD-tree is a derivation of 𝒫 ∪ {G}. Branches corresponding
to successful derivations are called success branches, branches corresponding
to infinite derivations are called infinite branches and branches corresponding
to failed derivations are called failure branches.

It has been proved that SLD-resolution is sound and complete with respect to
definite programs. In the propositional case, this can be stated as follows:

Theorem 13. Let 𝒫 be a definite logic program and G = ← a_1, ..., a_n be a goal.
Then 𝒫 ⊨ a_1, ..., a_n iff there is an SLD-refutation for 𝒫 and top goal G.
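
For propositional definite programs, the refutation search behind Theorem 13 can be sketched in a few lines of Python. This is only an illustration: the dictionary encoding, the function names and the ancestor check that prunes looping branches such as s ← s are assumptions of the sketch, not part of the paper. Because there are no variables, each atom of a goal can be solved independently.

```python
def prove_atom(program, atom, ancestors=frozenset()):
    """Return True if `atom` has a successful SLD derivation from `program`.

    `program` maps each atom to a list of clause bodies (lists of atoms);
    a fact is a clause with an empty body.
    """
    if atom in ancestors:
        # Re-selecting an ancestor cannot yield a new refutation; prune it.
        return False
    deeper = ancestors | {atom}
    return any(
        all(prove_atom(program, b, deeper) for b in body)
        for body in program.get(atom, [])
    )


def has_refutation(program, goal):
    """The goal <- a_1, ..., a_n has a refutation iff every a_i is provable."""
    return all(prove_atom(program, a) for a in goal)


# Hypothetical program:  p <- q, r.   q.   r <- q.   s <- s.
example = {"p": [["q", "r"]], "q": [[]], "r": [["q"]], "s": [["s"]]}
print(has_refutation(example, ["p"]))
print(has_refutation(example, ["s"]))
```

The ancestor check turns infinite branches into failed ones, which is harmless in the propositional case but would not carry over unchanged to programs with variables.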

Now we are in a position to present a proof procedure for negative queries
under the minimal model semantics of disjunctive logic programs, based on the
CH-transformation and SLD-resolution. The following simple facts help to
explain the basic idea behind the method.

Proposition 14. Let 𝒫 be a disjunctive program. Considering the
CH-transformation of 𝒫,

CH(𝒫) = 𝒫_C + 𝒫_H

denote by 𝒫'_H the Horn program obtained from 𝒫_H by deleting all control
variables. Then

1. For any a ∈ T_{𝒫_H} ↑ ω, 𝒫 ⊨_min a.

2. For each a ∈ T_{𝒫'_H} ↑ ω, there is a model M of 𝒫 such that a is true in M.

3. If M is a minimal model of 𝒫, then M ⊆ T_{𝒫'_H} ↑ ω.
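
The sets T ↑ ω in Proposition 14 are least fixpoints of the immediate consequence operator of a Horn program. A minimal sketch in Python, assuming clauses are encoded as (head, body) pairs; the function name and the clause set below are made up for illustration and are not the program of Example 5. The optional start set plays the role of a chosen set of control atoms.

```python
def tp_omega(clauses, start=()):
    """Least fixpoint of the immediate consequence operator, seeded with `start`."""
    model = set(start)
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            # Fire a clause when its whole body is already derived.
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model


# Hypothetical Horn program with control atoms A2 and B1 in clause bodies.
horn = [("r", []), ("p", ["r"]), ("s", ["p", "A2"]), ("t", ["B1"])]
print(sorted(tp_omega(horn)))
print(sorted(tp_omega(horn, {"A2", "B1"})))
```

Without control atoms only r and p are derived; seeding the iteration with A2 and B1 additionally derives s and t.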

To prove ¬q, assume R to be a computation rule that only selects atoms from
HB_𝒫. We construct an SLD-tree T for 𝒫_H ∪ {← q}. Let T' be the tree obtained
from T by deleting all control variables; then T' is an SLD-tree for the Horn
program without control variables and the goal ← q. For a branch b in T let b'
denote the corresponding branch in T'. A branch b in T may be
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1. a success branch. By 1 of Proposition 14, it means q is true in all minimal 
models of V. 

2. an infinite branch. Then the corresponding branch b' is also an infinite
branch.

3. a failure branch that ends in a non-empty goal containing atoms in 𝒫 (and
possibly also some new propositional atoms from C_𝒫). This case corresponds
to a failure branch in T'.

4. a failure branch that ends in a non-empty goal consisting only of new
propositional atoms. Let the leaf node of the branch b be ← N_1, ..., N_m (m > 0).
In this case b corresponds to a success branch b' in T'; by 2 of Proposition 14,
q is true in some model of 𝒫. Notice that N_i (1 ≤ i ≤ m) indicates
that the clauses with N_i in their bodies have been used as input clauses in the
derivation corresponding to branch b; therefore any model of CH(𝒫) that
contains N_1, ..., N_m must make q true. If there is a
(P, C_𝒫)-minimal model of CH(𝒫) among them, then by Theorem 9 we can conclude
that there is a minimal model of 𝒫 in which q is true.

Hence, the problem of answering a query G is reduced to answering the
following problem:

Let S be a subset of C_𝒫. Is there a (P, C_𝒫)-minimal model M = M_C ∪ M_H
of CH(𝒫) such that S ⊆ M_C? In other words, is there a model M =
M_C ∪ M_H of CH(𝒫) such that S ⊆ M_C and M_H is a minimal model of 𝒫?

There are many ways to test the minimality of a model [Lu97,Nie96]. The
following result is due to [Nie96].

Proposition 15. [Nie96] Let 𝒫 be a set of clauses. An interpretation M is a
minimal model of 𝒫 iff M is a model of 𝒫 and for every atom a, M ⊨ a implies
𝒫 ∪ N_𝒫(M) ⊨ a, where N_𝒫(M) = {¬a | a is an atom appearing in the
head of a clause in 𝒫 and M ⊭ a}.
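
The test of Proposition 15 can be tried out on small programs with a brute-force sketch in Python. The encoding is an assumption of this sketch: a clause is a pair (heads, body) of frozensets of atoms, read as "the conjunction of body implies the disjunction of heads", and entailment is decided by enumerating all interpretations, which is exponential and only meant for illustration.

```python
from itertools import combinations


def is_model(program, interp):
    # A clause holds if its body is not satisfied or some head atom is true.
    return all(not (body <= interp) or bool(heads & interp)
               for heads, body in program)


def entails(program, false_atoms, atom, universe):
    """Does program plus {not x : x in false_atoms} entail `atom`?"""
    for size in range(len(universe) + 1):
        for subset in combinations(universe, size):
            interp = frozenset(subset)
            if interp & false_atoms:
                continue  # violates one of the added negative literals
            if is_model(program, interp) and atom not in interp:
                return False  # counter-model found
    return True


def is_minimal_model(program, m):
    m = frozenset(m)
    if not is_model(program, m):
        return False
    head_atoms = frozenset().union(*(heads for heads, _ in program))
    universe = sorted(head_atoms.union(*(body for _, body in program)) | m)
    negated = head_atoms - m  # the atoms whose negations form N(M)
    return all(entails(program, negated, a, universe) for a in m)


# Hypothetical program:  a v b.   c <- a.
prog = [(frozenset("ab"), frozenset()), (frozenset("c"), frozenset("a"))]
print(is_minimal_model(prog, {"b"}), is_minimal_model(prog, {"a", "b", "c"}))
```

A practical implementation would replace the enumeration in `entails` by a call to a theorem prover or SAT solver.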

Now the problem can be solved as follows:

1. Compute all models of 𝒫_C containing S, denoted by MM(𝒫_C, S) (this can
be done by any existing model generation procedure [Lu97,Nie96]).

2. For each M_C ∈ MM(𝒫_C, S), compute the least model of 𝒫_H(M_C)¹ and
test whether it is a minimal model of 𝒫 using Proposition 15.

Let us summarize the procedure for minimal model reasoning.

Algorithm

Input: A disjunctive logic program 𝒫 and an atom q.

Output: 𝒫 ⊨_min ¬q if no minimal model contains q; otherwise either 𝒫 ⊨_min q
or a minimal model in which q is true.

1. Transform 𝒫 into 𝒫_C + 𝒫_H.

¹ This can be done by Hyper tableau or by the immediate consequence operator
for Horn programs.

2. Construct an SLD-tree T for 𝒫_H and ← q.

3. If there is a success branch in T then return 𝒫 ⊨_min q, otherwise

4. If there is a failure branch ending with a subgoal consisting only of control
variables and there is a (P, C_𝒫)-minimal model M_C ∪ M_𝒫 such that the subgoal
is contained in M_C, then return M_𝒫, otherwise

5. return 𝒫 ⊨_min ¬q.

Before ending this section we illustrate the algorithm with an example. 

Example 16. Consider the program 𝒫 from Example 5. Given the goal ← p, an
SLD-tree for the Horn program 𝒫_H and goal ← p is depicted in Fig. 1. In this
SLD-tree there is a success branch, and hence we can conclude 𝒫 ⊨_min p.

Fig. 2 gives an SLD-tree for the Horn program 𝒫_H and the goal ← s. It
contains no success branch but has a branch ending with a subgoal ← A_2.
Therefore a test is needed. There are two minimal models {A_2, B_1} and
{A_2, B_2} of 𝒫_C containing A_2. The models of CH(𝒫) “controlled” by the two
models are
M_1 = T_{𝒫_H} ↑ ω({A_2, B_1}) = {A_2, B_1, r, p, q, s, t} and
M_2 = T_{𝒫_H} ↑ ω({A_2, B_2}) = {A_2, B_2, r, p, q, s, u}.

Neither M_1 nor M_2 is a (P, C_𝒫)-minimal model, therefore we conclude
𝒫 ⊨_min ¬s.

Fig. 3 gives an SLD-tree for the Horn program 𝒫_H and goal ← t. It also
contains no success branch, but it has a branch ending with a subgoal ← B_1.
Therefore a test is needed. There are two minimal models {A_1, B_1} and
{A_2, B_1} of 𝒫_C containing B_1. The models of CH(𝒫) “controlled” by the two
models are
M_1 = T_{𝒫_H} ↑ ω({A_1, B_1}) = {A_1, B_1, r, p, q, t} and
M_2 = T_{𝒫_H} ↑ ω({A_2, B_1}) = {A_2, B_1, r, p, q, s, t}.

By the test, M_1 is a (P, C_𝒫)-minimal model and by Theorem 9, M = {r, p, q, t}
is a minimal model of 𝒫. So we return M as the output.



5 Conclusions and Further Work

In this paper we proposed a novel point of view on disjunctive logic programming.
We gave a transformation which transforms a disjunctive logic program 𝒫 into
a special disjunctive logic program CH(𝒫) = 𝒫_C ∪ 𝒫_H, where 𝒫_C is a control
program and 𝒫_H is a Horn program. We then established the relationship between
𝒫 and CH(𝒫). Based on this relation we developed a minimal model
reasoning procedure which differs from similar work in that SLD-resolution
can be directly incorporated into it, due to the transformation.

Many interesting topics remain to be explored. Currently we are working on the
following topics:

- Optimizing the query answering procedure. Many optimizations in the
minimality test phase are possible. For example, the minimal model generation
method based on E-hyper tableau [Lu97] can be used in the procedure to
improve performance.

- Extending the idea to normal logic programs under stable model semantics.

- Extending this method to general disjunctive logic programs under stable
model semantics.

- Extending this method to the non-ground case.
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Abstract. Partial-order programming is introduced in 
[JOM95], where it is shown how partial-order clauses help render clear
and concise formulations of different kinds of problems, in particular
optimization problems. In this paper we present some more examples
that can be modeled using partial-order clauses and we also introduce their
fix-point semantics. We show that this paradigm and standard logic
programming can be naturally integrated in one paradigm. We also discuss
WFSCOMP, a new semantics for normal programs, that can be
used to give the meaning of general normal+partial-order programs via
a translation.



1 Introduction 

In this paper, we study the semantics of a logic language whose principal building
blocks are partial-order + normal program clauses and complete lattice data
types. The motivation for our work, however, is very practical in nature: we claim
that partial-order clauses and lattices help obtain clear, concise, and efficient
formulations of problems requiring the ability to take transitive closures, solve
circular constraints, and perform aggregate operations. In a way, our paradigm
gives us a high-level logical notation to represent problems that are difficult to
express using only normal clauses.

We present several examples that are naturally expressed in this paradigm. 

Example 1.1 (Matrix chain product). 

This is a well-known example that can be efficiently solved using dynamic
programming [Sti87]. Suppose that we are multiplying n matrices M_1, ..., M_n.
This program finds the minimum number of scalar multiplications required to
compute the given task.

c(I,I) ≤ 0 :- size(N), 1 ≤ I ≤ N

c(J,K) ≤ c(J,I) + c(I+1,K) + r(J)*c(I)*c(K) :- J ≤ I ≤ K-1

where we encode the size of matrix M_I by r(I) and c(I), and we suppose that
c(I) = r(I+1). The functions r and c giving the matrix sizes have to be provided
as part of the code. The logic of the matrix chain product problem is very
clearly specified in the above program. The program is well defined because the
condition J ≤ I ≤ K-1 ensures that circularity is not present at all. The
codomain of function c is the set of natural numbers. The natural numbers can be
extended to a complete lattice by adding a top element ⊤. The function + has to
be extended such that, for instance, n + ⊤ = ⊤. These details are transparent to
the user. Here the bottom-up computation of the fix-point semantics behaves very
much like the dynamic programming technique.
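
The correspondence with dynamic programming can be made concrete with a small Python sketch. It assumes 0-based indices, lists r and c holding the row and column counts, and float infinity standing in for the top element; the function name is hypothetical. Each table entry starts at top and is lowered by every applicable instance of the second clause, which mirrors taking the greatest lower bound.

```python
def matrix_chain_cost(r, c):
    """Minimum number of scalar multiplications for M_0 * ... * M_{n-1}.

    r[i] and c[i] are the row and column counts of M_i, with c[i] == r[i+1].
    """
    n = len(r)
    top = float("inf")  # plays the role of the top element of the lattice
    cost = [[top] * n for _ in range(n)]
    for i in range(n):
        cost[i][i] = 0  # clause c(I,I) <= 0
    for length in range(2, n + 1):  # shorter chains first: no circularity
        for j in range(n - length + 1):
            k = j + length - 1
            for i in range(j, k):  # condition J <= I <= K-1
                candidate = cost[j][i] + cost[i + 1][k] + r[j] * c[i] * c[k]
                cost[j][k] = min(cost[j][k], candidate)  # glb of candidates
    return cost[0][n - 1]


# Three matrices of sizes 10x30, 30x5 and 5x60.
print(matrix_chain_cost([10, 30, 5], [30, 5, 60]))
```

For the three matrices above, multiplying the first two first costs 1500 + 3000 = 4500 scalar multiplications, against 27000 for the other order.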

Example 1.2 from [JOM95] (Shortest Distance). 

short(X,Y) ≤ C :- edge(X,Y,C)

short(X,Y) ≤ C + short(Z,Y) :- edge(X,Z,C)

Again, we think that the logic of the shortest-distance problem is very clearly
specified in the above program. The program is well defined even if the
extensional database for edge describes a directed graph with cycles.
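
A minimal Python sketch of how such a program can be evaluated on a cyclic graph: every short(X,Y) starts at the top element (float infinity here) and is lowered by the two clauses until nothing changes. The function name and the edge list are illustrative assumptions, and the sketch assumes non-negative edge costs so that the iteration terminates.

```python
def shortest(edges):
    """edges: list of (x, y, cost) triples with non-negative costs."""
    top = float("inf")
    nodes = {x for x, _, _ in edges} | {y for _, y, _ in edges}
    short = {(x, y): top for x in nodes for y in nodes}
    changed = True
    while changed:
        changed = False
        for x, z, c in edges:
            # Clause 1: short(X,Y) <= C :- edge(X,Y,C)
            if c < short[(x, z)]:
                short[(x, z)] = c
                changed = True
            # Clause 2: short(X,Y) <= C + short(Z,Y) :- edge(X,Z,C)
            for y in nodes:
                candidate = c + short[(z, y)]
                if candidate < short[(x, y)]:
                    short[(x, y)] = candidate
                    changed = True
    return short


# Hypothetical graph containing the cycle a -> b -> c -> a.
graph = [("a", "b", 2), ("b", "c", 1), ("c", "a", 4), ("a", "c", 5)]
dist = shortest(graph)
print(dist[("a", "c")], dist[("a", "a")])
```

Note that short(a,a) is the length of the shortest cycle through a, not 0, because the program has no clause for the empty path.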

Example 1.3 (The 0-1 knapsack problem). 

This is a well-known optimization problem that is known to be NP-complete. If,
however, the capacity is of order p(n) for some polynomial p(n), dynamic
programming runs in time of order p(n)·n, so we have a polynomial algorithm
[Sti87].

kn(I,M) ≥ 0

kn(I,M) ≥ kn(I-1,M) :- I ≥ 1

kn(I,M) ≥ kn(I-1, M-c(I)) + g(I) :- I ≥ 1, c(I) ≤ M
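
Read bottom-up, the three clauses are the usual dynamic-programming table for the 0-1 knapsack problem. The Python sketch below assumes items numbered from 1, integer weights c(I) and gains g(I) passed as lists, and kn(0, M) = 0 from the first clause; the function and parameter names are hypothetical.

```python
def knapsack(weights, gains, capacity):
    """Best total gain using each item at most once within `capacity`."""
    n = len(weights)
    # kn[i][m]: best gain using items 1..i with capacity m; starts at 0.
    kn = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, g = weights[i - 1], gains[i - 1]
        for m in range(capacity + 1):
            best = kn[i - 1][m]  # clause 2: leave item i out
            if w <= m:  # clause 3: take item i if it fits
                best = max(best, kn[i - 1][m - w] + g)
            kn[i][m] = best  # lub of all applicable clauses
    return kn[n][capacity]


print(knapsack([1, 3, 4, 5], [1, 4, 5, 7], 7))
```

With these four items and capacity 7, the best choice is the items of weight 3 and 4, for a total gain of 9.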

Example 1.4 from [JOM95] (Reach). 

Sets also have a partial order. We consider finite sets and again we can complete
them to ensure a complete lattice. The program works even if edge defines
a directed graph with cycles.

reach(X) ≥ {X}

reach(X) ≥ reach(Y) :- edge(X, Y)
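
A small Python sketch of the corresponding bottom-up computation, with set union as least upper bound. The function name and the sample graph are assumptions made for illustration.

```python
def reach(edges):
    """edges: list of (x, y) pairs; returns the set reachable from each node."""
    nodes = {x for x, _ in edges} | {y for _, y in edges}
    result = {x: {x} for x in nodes}  # clause reach(X) >= {X}
    changed = True
    while changed:
        changed = False
        for x, y in edges:
            # Clause reach(X) >= reach(Y) :- edge(X, Y)
            if not result[y] <= result[x]:
                result[x] |= result[y]
                changed = True
    return result


# Hypothetical graph with the cycle a -> b -> a.
print(reach([("a", "b"), ("b", "a"), ("b", "c")]))
```

The cycle causes no difficulty: the sets only grow and are bounded by the finite set of nodes, so the iteration stops.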

We refer the reader to [JOM95] for more examples illustrating
the use of set patterns and partial-order clauses. In [J098a] we present an
operational semantics that combines top-down goal reduction, memo-tables and a
fix-point procedure for solving functional constraints. One contribution of this
paper is to observe that our high-level notation allows us to use dynamic
programming to solve this kind of problem, which yields an efficient
computation.

In [OJ97] we showed how to model these programs by translating them to
standard normal clauses. In particular, we showed in [OJ97] that using only
the stratified semantics we can capture the meaning of a large class of
partial-order programs. This line of research is extended in further detail in
[J098b]. We present here a simple translation that does not require the
translated program to be stratified. However, we have to define a new semantics
to capture the intended meaning of the translated programs (since WFS, STABLE
and COMP fail to do so). We call this semantics WFSCOMP because it combines WFS
and COMP (the two-valued Clark’s completion semantics) in a suitable way. This
semantics appears interesting in its own right, as we will show.

Our paper is organized as follows. Section 1 is the introduction, where we give
several examples to motivate our approach. In section 2 we give the background;
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most of this material is taken from [JOM95]. In section 3 we present the fix-point 
semantics for partial-order programs. In section 4 we discuss the translation to 
normal programs and the use of WFSCOMP. Finally, we present our conclusions. 

2 Background 

We present the basic background of partial-order programs as introduced in
[JOM95]¹. There are two basic forms of a partial-order clause²:

f(terms) ≥ expression :- condition

f(terms) ≤ expression :- condition

condition ::= goal | goal, condition
goal ::= p(terms) | ¬p(terms)

where the predicate p appearing in p(terms) above is an extensional database
predicate, i.e., one that is defined by ground unit clauses. Terms are made up
of constants, variables, and data constructors, while expressions are in
addition made up of user-defined functions, i.e., those that appear at the head
of the left-hand sides of partial-order clauses. Informally, the declarative
meaning of a partial-order clause is that, for all its ground instantiations
(i.e., replacing variables by ground terms), the function f applied to argument
terms is ≥ (respectively, ≤) the ground term denoted by the expression on the
right-hand side, if condition is true. In general, multiple partial-order
clauses may be used in defining some function f. We define the meaning of a
ground expression f(terms) to be equal to the least upper bound (respectively,
greatest lower bound) of the resulting terms defined by the different
partial-order clauses for f. A program consists of a finite set of partial-order
clauses and a finite extensional database.

We now present a model-theoretic semantics for partial-order clauses. For
simplicity of presentation, we consider only ≥ clauses in this section; the
treatment of ≤ clauses is symmetric. We do not consider the definition of a
function using a combination of ≤ and ≥ clauses. This is why the
semantics of ≥ clauses can be given in a modular way, without any possibility
of interference from ≤ clauses, and vice versa. We also consider functions with
only one argument, but our results carry over straightforwardly to the general
case. Note, however, that this argument can be a general term (which could
simulate a multi-argument function using a list), and hence there is no loss of
generality in this assumption.

In preparation for the semantics, we first reduce the program with respect to the
extensional database and then we use the flattened form for all reduced clauses
and goals. We use an example to explain the reduction. Let E be as follows:
edge(a,b,2)
edge(a,c,3)

Then the reduction of the Shortest Distance program with respect to edge becomes:

¹ Definitions and Propositions are taken from [JOM95].

² The operational semantics introduced in [JOM95] also requires that each
variable in expression also occurs in terms.

short(a,b) ≤ 2
short(a,c) ≤ 3
short(a,Y) ≤ 2+short(b,Y)
short(a,Y) ≤ 3+short(c,Y)

The idea of flattening has been mentioned in several places in the literature 
[G*87,Hol89,Jan94,JJ97]. We follow the definition given in [Jan94], and we il- 
lustrate it by a simple example. 

Example 2.1. Assuming that f, g, h, and k are user-defined functions and the
remaining function symbols are constructors, the flattened form of a clause
f(c(X,Y)) ≥ c1(c2(g(c3(X)), k(d1(h(d2(Y,1)))))).
is as follows:

f(c(X,Y)) ≥ c1(c2(Y1, Y3)) :- g(c3(X)) = Y1, h(d2(Y,1)) = Y2,
k(d1(Y2)) = Y3.

In the above flattened clause, we follow Prolog convention and use the notation
:- for ‘if’ and commas for ‘and’. All variables are understood to be universally
quantified at the head of the clause, as is customary for definite clauses.
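
Flattening of the right-hand side can be sketched as a short recursive Python function. The term encoding (nested tuples whose first component is the function symbol, strings for variables and constants), the function name and the naming scheme Y1, Y2, ... for fresh variables are assumptions of this sketch; it also assumes that no variable of the clause is already called Y1, Y2, and so on.

```python
from itertools import count


def flatten(expr, user_defined, fresh=None):
    """Return (term, goals): `term` has every user-defined call replaced by a
    fresh variable, and `goals` lists the basic goals (call, variable) in
    leftmost-innermost order."""
    if fresh is None:
        fresh = count(1)
    if not isinstance(expr, tuple):
        return expr, []  # a variable or a constant
    functor, *args = expr
    flat_args, goals = [], []
    for arg in args:  # arguments first, left to right
        term, sub_goals = flatten(arg, user_defined, fresh)
        flat_args.append(term)
        goals.extend(sub_goals)
    term = (functor, *flat_args)
    if functor in user_defined:
        var = f"Y{next(fresh)}"
        goals.append((term, var))  # basic goal  functor(args) = var
        return var, goals
    return term, goals


# Right-hand side of Example 2.1: c1(c2(g(c3(X)), k(d1(h(d2(Y,1))))))
rhs = ("c1", ("c2", ("g", ("c3", "X")),
              ("k", ("d1", ("h", ("d2", "Y", "1"))))))
flat_term, flat_goals = flatten(rhs, {"f", "g", "h", "k"})
print(flat_term)
print(flat_goals)
```

On the right-hand side of Example 2.1 this yields the term c1(c2(Y1, Y3)) and the goals g(c3(X)) = Y1, h(d2(Y,1)) = Y2, k(d1(Y2)) = Y3, in that order.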

The general form of a flattened clause is
Head :- Body

where Head is f(t) ≥ u, and t and u are terms, and Body is of the form E_1, ...,
E_n. Each E_i is f_i(t_i) = y_i, where each f_i is a user-defined function
symbol, each y_i is a new variable not present in Head, and each t_i is a term
that is equivalent to the argument of f_i in the original, unflattened program
clause. Each formula f_i(t_i) = y_i in Body is called a basic goal, and a
sequence of basic goals is called a goal sequence.

The order in which the basic goals are listed on the right-hand side of a
flattened clause is the leftmost-innermost order for reducing expressions [Man74].

Finally, note that the flattened form of a query is similar to that of Body. In
order to capture the ⊥-as-failure assumption, we assume that for every function
symbol f in P, the program is augmented by the clause: f(X) ≥ ⊥.



2.1 Model-Theoretic Semantics 

We will work with Herbrand interpretations, where the Herbrand Universe of a
program P consists only of ground terms, and is referred to as U_P. The Herbrand
Base B_P of a program P consists of ground equality atoms of the form f(t) = u,
where f is a user-defined function, t is a ground term, and u is a ground term
belonging to some complete-lattice domain³. Henceforth, we will always use the
symbol f to stand for a user-defined (i.e., non-constructor) function symbol.

We develop the model-theoretic semantics without reference to the details
of specific lattice domains, such as sets, numbers, etc. This allows our
presentation to focus on the essentials of partial-order clauses without
digressing to discuss the axiomatizations (equational theories) of specific data
domains. A full

³ Since the program is reduced by the extensional database we do not have to
include atoms of the form p(t).
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treatment of the logical foundations of the set constructors described in section 
1 is given in [Jan94,JJ97], and we refer the reader to these sources for more 
information. However, in giving examples to illustrate certain points about the
semantics, we will need to make use of specific data domains. An intuitive
understanding of these domains suffices for the examples.

Due to the equational theories for constructors, the predicate = defines an
equivalence relation over the Herbrand Universe. But we can always contract
a model to a so-called normal model where = defines only an identity relation
[Men87] as follows: Take the domain D' of I to be the set of equivalence classes
determined by = in the domain U_P of I. Then use Herbrand =-interpretations,
where = denotes that the domain is a quotient structure. We should then refer
to elements in D' by [t], i.e. the equivalence class of the element t, but in
order to make the text more readable, we will refer to the [t] elements just as
t, keeping in mind that formally we are working with the equivalence classes of
t. These details are explained in [Jan94,JJ97].

We assume that every interpretation I includes certain equality and inequality
atoms of the form t_1 = t_2 and t_1 ≤ t_2 according to the fixed intended
interpretation of them in the program.

We also assume that, in every interpretation I, f is interpreted as a total
function, i.e.,

(i) (∀t ∈ U_P)(∃u ∈ U_P) f(t) = u ∈ I; and

(ii) f(t) = t_1 ∈ I and f(t) = t_2 ∈ I ⟹ t_1 = t_2.

Definition 1 ([JOM95]).

Let P be a program. An interpretation M is a model of P, denoted by M ⊨ P,
if for every ground instance f(t) ≥ t_1 :- E_1, ..., E_k of a ≥ clause in P, if
{E_1, ..., E_k} ⊆ M then there exists an atom f(t) = u ∈ M with u ≥ t_1.

We first briefly motivate our approach to the model-theoretic semantics.
Basically, we define the semantics of a function call f(t), where t is a ground
term, to be the glb (greatest lower bound) of all terms defined for f(t) in the
different Herbrand models for f (see Definition 1 for the notion of model). To
see why we need to take such glbs, consider the following trivial program P:
f(X) ≥ 1

Here, we assume that the result domain for f is the lattice of totally-ordered
numbers N: 0 < 1 < 2 < ... < ⊤, for some ⊤. Among the models of P are those that
interpret f as a constant function:

f(X) = 1, for all X ∈ U_P
f(X) = 2, for all X ∈ U_P
...
f(X) = ⊤, for all X ∈ U_P

The intended model for function f, namely, f(X) = 1, for all X ∈ U_P, is obtained
not by the classical set-intersection (∩) of all models, but instead by the ⊓ of
the terms defined for f(t) in the different models. In the above example, ⊓ is,
of course, the min operator on numbers.
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But not all syntactically well-formed programs have a well-defined meaning. 
Circularity in function definitions is allowable as long as this occurs through
monotonic functions. Consider the following program over the boolean lattice:
a ≥ c(b)
b ≥ c(a)
c(⊥) ≥ ⊤

This program has several models, none of which has an intended meaning. Thus,
non-monotonic functions are permissible as long as there are no circular
definitions through such functions. This motivates our interest in stratified
partial-order programs. We begin the discussion of this topic with
strongly-stratified programs, defined below, and continue the discussion with
cost-monotonic programs in section 2.2.

Definition 2 ([JOM95]).

A program P is strongly-stratified if there exists a mapping function,
level : F → N, from the set F of user-defined (i.e., non-constructor) function
symbols in P to (a finite subset of) the natural numbers N such that:

(i) All clauses of the form
f(term) ≥ term

are permitted.

(ii) For a clause of the form
f(term) ≥ g(expr)

where f and g are user-defined functions, level(f) is greater than or equal to
level(g) and level(f) is greater than level(h), where h is any user-defined
function symbol that occurs in expr.

(iii) No other form of clause is permitted.

Note that we have given the above definition using the non-flattened form of
program clauses because the definition is easier to understand this way. Although
a program can have different level mappings, we assume that we select one whose
image is a set of consecutive natural numbers that includes 1. For example,
in the reach program shown before, the function edge would be at level one, and
the function reach would be at level two. The above definition of stratification
is, in another sense, very restrictive: it requires a function at any level to be
directly defined in terms of other functions at the same level. For instance, the
programs in examples 2.5 and 2.6 are not strongly-stratified. We therefore relax
this requirement in section 2.2, wherein we introduce cost-monotonic programs.

Definition 3 ([JOM95]).

Let P be a set of strongly-stratified program clauses. We define P_k as those
clauses of P for which the user-defined function symbols on the left-hand sides
have level ≤ k.

Definition 4 ([JOM95]).

Given two interpretations I and J for a program P, we define I ⊑ J if for every
f(t) = t_1 ∈ I there exists f(t) = t_2 ∈ J such that t_1 ≤ t_2. We say I = J if
I ⊑ J and J ⊑ I.
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We will construct the model-theoretic semantics of a strongly-stratified pro- 
gram level by level. Thus, in defining models at some level j > 1, all functions 
from levels < j will have their models uniquely specified. Hence, all interpreta- 
tions of clauses at some level j will contain the same atoms for every function 
from a level < j. For this reason, we will overload the meaning of the function 
level and use the notation level{A) to refer to the level of the head function 
symbol of atom A: 

Definition 5 ([JOM95]).

For any interpretation I, I_k := {A : A ∈ I ∧ level(A) ≤ k}.

Definition 6 ([JOM95]).

For any two interpretations I and J of a program P,

I ⊓ J := {f(t) = u ⊓ u' : f(t) = u ∈ I, f(t) = u' ∈ J, f a function symbol of
P, t ∈ U_P}

Definition 7 ([JOM95]).

For any set X of interpretations, ⊓X is the natural generalization of the
previous definition.
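
For a numeric result domain, the glb of interpretations can be sketched in Python as a pointwise minimum. The dictionary encoding (ground call mapped to its value), the function names and the use of float infinity for the top element are assumptions of this sketch, which also assumes a non-empty collection of interpretations over the same ground calls.

```python
from functools import reduce

TOP = float("inf")  # stands in for the top element of the number lattice


def meet(i, j):
    """Pointwise glb of two interpretations given as {ground call: value}."""
    return {call: min(i[call], j[call]) for call in i.keys() & j.keys()}


def meet_all(interpretations):
    """Generalization of `meet` to a non-empty collection of interpretations."""
    return reduce(meet, interpretations)


# Three models of the program f(X) >= 1 over the ground terms a and b.
models = [
    {"f(a)": 1, "f(b)": 1},
    {"f(a)": 2, "f(b)": 2},
    {"f(a)": TOP, "f(b)": TOP},
]
print(meet_all(models))
```

The result is the intended model in which f is constantly 1, whereas the plain set intersection of the three sets of atoms would be empty.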



Proposition 8 ([JOM95]).

Let X be a set of models for a program P with j levels such that for any I ∈ X
and J ∈ X, I_{j-1} = J_{j-1}. Then ⊓X is also a model.

Definition 9 ([JOM95]).

Given a program P with j levels, we define the model-theoretic semantics of P
as:

for j = 1, M(P_1) := ⊓{M : M ⊨ P_1}, and

for j > 1, M(P_j) := ⊓{M : M_{j-1} = M(P_{j-1}) and M ⊨ P_j}.

Definition 10 ([JOM95]).

Given a program P with j levels and a goal sequence G, we say that substitution
θ is a correct answer for G if M(P_j) ⊨ Gθ.



2.2 Cost-Monotonic Programs 

The strongly-stratified language defined in section 2.1 permits the definition of
one function directly in terms of another function at the same level or lower level.
However, the cost-monotonic language defined below permits the definition of
one function in terms of another function at the same level using monotonic
functions. In the following definitions, as before, we assume functions with one
argument.

Definition 11. A function f is monotonic if t_1 ≤ t_2 ⟹ f(t_1) ≤ f(t_2).



Definition 12 ([JOM95]).

A program P is cost-monotonic if there exists a mapping function,
level : F → N, from the set F of user-defined (i.e., non-constructor) function
symbols in P to (a finite subset of) the natural numbers N such that:

(i) Every clause as defined in Definition 2 is permitted. A clause of this form
is called an S-S clause (S-S stands for strongly-stratified).

(ii) For a clause of the form
f(terms) ≥ m(g(expr))

where m is a monotonic function, level(f) is greater than level(m), level(f) is
greater than or equal to level(g) and level(f) is greater than level(h), where h
is any function symbol that occurs in expr. A clause of this form is called a
G-S clause (G-S stands for general-stratified).

(iii) No other form of clause is permitted.



In the above definition, note that / and g are not necessarily different. Also, 
non-monotonic “dependence” occurs only with respect to lower-level functions. 
We can in fact have a more liberal definition than the one above: First, since a 
composition of monotonic functions is monotonic, the function m in the above 
syntax can also be replaced by a composition of monotonic functions. Second, 
it suffices if the ground instances of program clauses are stratified in the above 
manner. This idea is, of course, analogous to that of local stratification [Prz88], 
except that we are working with functions rather than predicates. It should be 
clear that the presence of monotonic functions does not call for any alteration 
of the model-theoretic semantics. 

Finally, we would like to note that in general it is undecidable whether
a function definition is monotonic. For certain
domains, such as sets, it is possible to detect the monotonicity property in many
(but not all) cases by a simple syntactic check.



3 Fix-Point Semantics 

The set of interpretations defines a complete lattice, where the bottom
interpretation is the one where every function evaluates to ⊥ and the top
interpretation is the one where every function evaluates to ⊤. These
interpretations are ordered as given in Definition 4. We provide a fix-point
characterization of the declarative semantics. We will define a T_P operator
(monotonic over a program layer with respect to ⊑) that maps interpretations of
a given program layer into interpretations for the same program layer, where the
fix-point semantics of lower-level subprograms has already been computed. Unless
stated otherwise, we assume that every program is cost-monotonic.
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Definition 13. The definition of Tp is as follows: 

T_P(I) := {f(t) = s : f is a function symbol, t is a ground term, and s :=
gts(f(t), I)}
where

gts(e, I) := lub({t_1 : e ≥ t_1 :- cond ∈ P and cond is true in I})

Even though the mapping T_P is, in general, not monotonic, it does have an
important property similar to monotonicity for stratified normal programs, as
described in [Llo87]. The property is the following.



Lemma 14. Let P be a program. We define its Fix-Point Semantics, denoted
by T(P), as follows:

Suppose P = P_1. Then T_P is monotonic over the lattice of interpretations for P,
and so T(P) := LFP(T_P) is well defined, where LFP denotes the least fix-point.

Suppose P = P_{k+1}, k ≥ 1. Let

A_P := {I : I is an interpretation for P, where I_k = T(P_k)}.

Let T(P) := LFP(T_P) (T_P over A_P).

It is not hard to see that T is well defined.



Theorem 15. For any program P, M(P) = T(P).

To compute the fix-point semantics, we can use the naive approach to computing
the stratified semantics (much as is standard for normal programs). That
is, we compute the stratified semantics level by level. To compute the stratified
semantics at a given level we start with the bottom interpretation and then
iterate T_P to compute the fix-point semantics. We do not always arrive at the
fix-point semantics in a finite number of steps, as the following program shows:
a ≥ 1 + a

Clearly, we arrive at the least fix-point, i.e. a = ⊤, only after ω steps.
However, every program that we have tried that comes from a “real” problem
was solved in a finite number of steps.
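
The behaviour of the program a ≥ 1 + a under naive iteration can be observed with a small Python sketch. It assumes the number lattice with bottom 0 and float infinity as top, and a step budget so that the loop always stops; the helper names are hypothetical.

```python
TOP = float("inf")  # top element of the number lattice


def iterate(step, bottom, max_steps):
    """Apply `step` starting from `bottom` until a fix-point or the budget."""
    trace = [bottom]
    for _ in range(max_steps):
        nxt = step(trace[-1])
        if nxt == trace[-1]:
            break  # fix-point reached
        trace.append(nxt)
    return trace


def step_a(a):
    # One application of T_P for  a >= 1 + a  (together with a >= bottom):
    # the new value is the lub of the right-hand sides.
    return max(0, 1 + a)


print(iterate(step_a, 0, 5))
print(iterate(step_a, TOP, 5))
```

Starting from the bottom element, every finite iterate is a natural number and the value keeps growing; only the top element is a fix-point of the step function.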

The naive evaluation is a bottom-up strategy which directly follows the
fix-point semantics, in that it applies the T_P operator iteratively.
The seminaive method, which uses the same approach as naive evaluation but is
restricted to generating new tuples, here corresponds to applying the evaluation
only to update the tuples. This strategy is nothing other than dynamic
programming [Sti87].



3.1 Fix-Point Semantics of Stratified Normal+Partial-Order
Programs

We consider here the integration of normal clauses and partial-order clauses. For
that, we extend the goals that can occur in condition to include equational
assertions of the form f(terms) = X, but the level of f should be less than the
level of the function in the head of the given clause. We also accept normal
clauses as defined in [Llo87], but now they can also include equational
assertions in the body of the clause. Again, the level of a function symbol in an
equational assertion that occurs in the body should be less than the level of the
predicate symbol that occurs in the head of the clause. We point out that now our
interpretations include functional atoms as well as the usual predicate atoms.

Definition 16. Given an interpretation I, we define I^f := {f(t) = s ∈ I | f
is a function symbol}. We define I^p to be the complement of I^f with respect
to I, that is, I^p consists of predicate atoms.

Definition 17. Given two interpretations I, J, define I ⊑_g J iff I^f ⊑ J^f and
I^p ⊆ J^p.

Note that ⊑_g defines a partial order on the set of interpretations of any given
program.

Definition 18. We say that a model is minimal if it is minimal under the partial
order ⊑_g.

As an example consider the following program:

    p ≥ {X} :- q(X).
    q(1).
    q(2) :- q(2).

$I := \{q(1), q(2), p = \{1,2\}\}$ is an interpretation of the program, where
$I^f := \{p = \{1,2\}\}$ and $I^p := \{q(1), q(2)\}$. Moreover, $I$ is a model of
the program. Another model of the program is $J = \{q(1), p = \{1\}\}$. Note that
$J \sqsubseteq_c I$ and $J$ is a minimal model. The program is considered
stratified, where the definition of q defines the first level and the definition
of p defines the second level. So, to compute the stratified semantics we first
compute the minimal model for:

    q(1).
    q(2) :- q(2).

which is {q(1)}. We do this using the standard monotonic operator as defined
in [Llo87]. Then we compute the minimal model of:

    p ≥ {X} :- q(X).

where the semantics of q is known. We do this by using the operator Tp defined
in this section. We then obtain $J$ as the stratified semantics of this program.
We think that any reader with some familiarity with the fix-point semantics
of standard logic programs can work out the final details of the complete
formalization of the stratified normal+partial-order programs.
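
The two-level computation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: ground atoms are encoded as tuples such as ("q", 1), the helper name least_model is hypothetical, and the second level simply collects the least set value that satisfies p ≥ {X} for every X with q(X) true.

```python
def least_model(facts, rules):
    """Least fixpoint of the immediate-consequence operator for ground
    definite rules. Each rule is a pair (head, body), body a list of atoms."""
    model = set()
    while True:
        derived = set(facts) | {head for head, body in rules
                                if all(atom in model for atom in body)}
        if derived == model:
            return model
        model = derived

# Level 1:  q(1).   q(2) :- q(2).
q_model = least_model({("q", 1)}, [(("q", 2), [("q", 2)])])

# Level 2:  p >= {X} :- q(X).  With q fixed, the least admissible value of p
# is the union (least upper bound in the subset order) of all {X} with q(X).
p_value = frozenset(x for (_, x) in q_model)

print(q_model, p_value)
```

The clause q(2) :- q(2) never fires, so the first level contains only q(1), and the second level assigns p the set containing 1, in line with the model J above.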

4 Semantics based on normal programs 

The strategy here is to translate a normal+partial-order program to a standard
normal program and then to define the semantics of the translated normal
program as the semantics of the original program. We have studied this problem in
detail in [OJ97], but we consider here what we claim is the most natural
translation. Under this translation, none of the well-known semantics gives the
intended meaning to our programs. So, we define a new semantics that, as we will
show, is interesting in itself. A main result is that this semantics gives the
intended meaning to our stratified normal+partial-order programs. We work in this
section with the normal form of a program. This form is obtained from the
flattened form by replacing every assertion of the form f(t) = t1 by the atom
f=(t, t1) and every assertion of the form f(t) ≥ t1 by f≥(t, t1).

Except for minor changes, the following four definitions are taken from [OJ97]. 



Definition 19. Given a stratified normal+partial-order program P, we define
P' to be as follows: Replace each partial-order clause of the form

    E0 :- condition, E1, ..., Ek, ..., En

by the clause

    E0 :- condition, E1, ..., E'k, ..., En

where E0 is of the form f≥(t1, X1), Ek is of the form g=(tk, Xk), E'k is of the
form g≥(tk, Xk) and f and g are (not necessarily different) functions at the same
level. Note that when a clause is strongly-stratified we have k = n.

Definition 20. Given a program P, we define head(P) to be the set of head 
functional symbols of P, i.e., the head symbols on the literals of the left-hand 
sides of the partial-order clauses. 



Definition 21. Given a program P and a predicate symbol f≥ which does not occur
at all in P, we define ext_1(f) as the following set of clauses:

    f=(Z, S) :- f≥(Z, S), ¬ fbetter(Z, S)
    fbetter(Z, S) :- f≥(Z, S1), S1 > S
    f≥(Z, S) :- f≥(Z, S1), S1 > S
    f≥(Z, ⊥)
    f≥(Z, C) :- f≥(Z, C1), f≥(Z, C2), lub(C1, C2, C).

The first two clauses are given in [Van92]. We call the last clause the lub
clause; it is omitted when the partial order is total. Here lub(C1, C2, C) means
that C is the least upper bound of C1 and C2. Symmetric definitions have to be
provided for f≤ symbols.



Definition 22. Given a stratified normal+partial-order program P, we define
    ext_1(P) := $\bigcup_{f \in head(P)}$ ext_1(f), and
    transl'_1(P) := P' ∪ ext_1(P).

The following basic result is given in [OJ97].

Proposition 23. For any stratified normal+partial-order program P, transl'_1(P)
is stratified.
As an example of the translation we use program Reach given in example 1.4.
The relevant clauses of the translated program are:

    reach≥(X, Y) :- scons(X, ∅, Y)¹
    reach≥(X, Z) :- edge(X, Y), reach≥(Y, Z)
    reach≥(X, ∅)
    reach≥(Z, S) :- reach≥(Z, S1), S1 > S
    reach≥(Z, S) :- reach≥(Z, S1), reach≥(Z, S2), union(S1, S2, S)
    reach=(Z, S) :- reach≥(Z, S), ¬ reachbetter(Z, S)
    reachbetter(Z, S) :- reach≥(Z, S1), S1 > S

The following definition is given in [OJ97]. It can be used as a definition of the
declarative semantics of a stratified normal+partial-order program.

Definition 24. For any stratified normal+partial-order program P, we define D(P)
as the stratified model for transl'_1(P).

Consider the program P: a ≥ 1 + a.

Note that the model-theoretic semantics of P (as defined in section 2) defines
that a = ⊤. On the other hand, the stratified model of transl'_1(P) interprets a
as a partial function, that is, for every ground term t, a=(t) is false in D(P).
This is the only “generic” example where the two approaches give different
answers. The following is a basic result of this paper and explains formally the
above informal claim.

Proposition 25. Let P be a cost-monotonic program. If every function is defined
total in D(P), then D(P) = M(P) restricted to the common language.

From a computational point of view, the behavior of D(P) is more realistic than
the behavior of M(P).

The problem with this approach (meaning the use of D(P)) is that it only
works with cost-monotonic programs. It has been shown that non cost-monotonic
programs sometimes make sense. So, we need a translation that works for a larger
class of programs. We consider a more direct translation than transl'_1, which we
will call transl_1. Both translations are very similar, and close variants of them
have been studied in [OJ97,Van92].



Definition 26. Given P, we define transl_1(P) := P ∪ ext_1(P).

For our program Reach the new translation is as before, but we replace the
clause:

    reach≥(X, Z) :- edge(X, Y), reach≥(Y, Z)

by the clause:

    reach≥(X, Z) :- edge(X, Y), reach=(Y, Z)

¹ To get rid of the set-constructor that has a variable as an argument in
reach≥(X, {X}).

Before we test this (more basic) translation with non cost-monotonic programs,
we should test it with cost-monotonic programs. This new translation does not
always give a stratified program from a cost-monotonic program. We showed
in [OJ97] that neither WFS nor STABLE defines the intended semantics for
it. The apparently simple program Reach becomes a difficult one to give a
well-defined semantics after the translation. The problem arises when edge
induces a graph with cycles, as for instance Edge2 := {edge(1,2), edge(2,3),
edge(3,2)}. The stable semantics fails with Reach ∪ Edge2 because it gives
no stable models at all. The well-founded semantics defines a non-total model of
the program. The problem is not as serious as with the stable semantics, since
there is a partial model consistent with the intended interpretation. Note that
the WFS agrees in its true/false assignments with the intended model. Some
undefined values for reach(2) are:

reach(2) ≥ {3}, reach(2) ≥ {2, 3}, reach(2) = {2}, reach(2) = {2, 3}.

An interesting point is that the well-founded model agrees with the intended
model in the assignments of many false values. For instance, reach≥(2, {1}) is
false in the partial model. This “decision” prunes all the unacceptable “large”
models of the program.
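
The intended value of reach on Edge2 can be cross-checked by a direct least-fixpoint iteration over set values. The sketch below is only an illustration: it assumes that Reach states reach(X) ≥ {X} and reach(X) ≥ reach(Y) whenever edge(X, Y), which is what the translated clauses suggest, and the function name reach_values is hypothetical.

```python
def reach_values(edges):
    """Least fixpoint of: reach(X) >= {X};  reach(X) >= reach(Y) if edge(X, Y)."""
    nodes = {n for edge in edges for n in edge}
    reach = {n: {n} for n in nodes}        # clause reach(X) >= {X}
    changed = True
    while changed:
        changed = False
        for x, y in edges:
            if not reach[y] <= reach[x]:   # reach(X) must dominate reach(Y)
                reach[x] |= reach[y]
                changed = True
    return reach

edge2 = [(1, 2), (2, 3), (3, 2)]
print(reach_values(edge2)[2])
```

On Edge2 the iteration stabilises with reach(2) equal to the set containing 2 and 3, so 1 is never added, which matches the falsity of reach≥(2, {1}) noted above.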

What about Clark’s completion? How does it behave with respect to
our program Reach? The Herbrand models of the completion of our example
are the following:

1. The intended model, i.e., when reach=(2, {2, 3}).

2. Models where there exists s such that reach=(2, s) is true, but excluding
the intended model. There is a model of this kind for each s such that s
> {1,2} is true. Call this class of models $C_d$.

3. Models where for every s, reach=(2, s) is false. Let M be a model of this
kind. Then, there exists a non-terminating sequence of ground terms $s_0$, $s_1$,
..., $s_i$, ... such that $s_i < s_{i+1}$ and reach≥(2, $s_i$) is true. Call this
class of models $C_u$. To see that the claim is true, we take $s_0 = \bot$. Then
reach≥(2, $s_0$) is true, and by hypothesis reach=(2, $s_0$) is false, i.e.,
¬reach=(2, $s_0$) is true. By the clause that defines reach=, we get that
reachbetter(2, $s_0$) is true and so, by the converse of the clause that defines
reachbetter, there exists $s_1$ such that $s_0 < s_1$ and reach≥(2, $s_1$) is
true. We can now apply the same argument taking $s_1$ instead of $s_0$ to show
that there exists $s_2$ such that $s_1 < s_2$ and reach≥(2, $s_2$) is true. We can
apply the argument forever.

So, we have many more models ($C_u \cup C_d$ in this case) than expected. This is
because negation is weak for COMP. Next, we show how to combine WFS and
COMP to get the intended model. We remind the reader of the following well-known
fact: given a program P, M is a model of COMP(P) iff M is a supported
model of P.

Definition 27. We define the semantics WFSCOMP(P) as the set of literals
that are true in every two-valued supported Herbrand model that extends
WFS(P). Any such extension is a model that agrees with the true/false
assignments given by WFS. If no such model exists then WFSCOMP(P) := WFS(P).
The following is another result of this section. It is based on the observation
that under the given conditions D(P) is the unique model of COMP(transl_1(P))
that extends WFS(transl_1(P)).

Theorem 28. For every stratified normal+partial-order program P such that
every function is defined total in D(P), D(P) = WFSCOMP(transl_1(P)).

We conjecture that the condition that every function is defined total in D(P) is
in fact not required here. But it is straightforward to see that the theorem is
true under the given condition.

What can we say about the behavior of WFSCOMP for general normal programs?

Let us consider the following example, considered elsewhere, which is
representative of the problems with reasoning by cases. Let P be

    a ← ¬b
    b ← ¬a
    p ← a
    p ← b

Several authors have argued that, since neither a nor b can be derived, in any
semantics based on two-valued models the disjunction a ∨ b, and thus also p,
should be true. WFS(P) does not fulfill this point. STABLE as well as COMP derive
p. So does our proposed semantics.

STABLE, on the other hand, is inconsistent for many programs. This occurs
when STABLE has several stable models or, even worse, when it lacks
stable models. Consider the following program P:

    b ← a
    a ← b
    a ← ¬b

Then P does not have any stable model. One can argue by reasoning by cases (on
b) that a should be a consequence of P. Once this fact is accepted, we observe
that b is also a consequence of the program. So, the intended model of P is
{a, b}. This is what COMP(P) defines, as well as our proposed semantics.
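
For ground propositional programs, Definition 27 can be prototyped directly. The following Python sketch is an illustration only, not the author's implementation: rules are encoded as triples (head, positive body, negative body), the well-founded model is computed with the standard alternating-fixpoint construction, and all function names are hypothetical.

```python
from itertools import chain, combinations

def atoms_of(program):
    return sorted({h for h, _, _ in program}
                  | {a for _, pos, neg in program for a in chain(pos, neg)})

def gamma(program, assumed):
    """Least model of the reduct of `program` with respect to `assumed`."""
    reduct = [(h, pos) for h, pos, neg in program if not set(neg) & assumed]
    model, changed = set(), True
    while changed:
        changed = False
        for h, pos in reduct:
            if h not in model and set(pos) <= model:
                model.add(h)
                changed = True
    return model

def wfs(program):
    """Well-founded model as a pair (true atoms, false atoms)."""
    true = set()
    while True:
        possible = gamma(program, true)       # overestimate of the true atoms
        new_true = gamma(program, possible)   # underestimate of the true atoms
        if new_true == true:
            return true, set(atoms_of(program)) - possible
        true = new_true

def is_supported(program, model):
    """Model of Clark's completion: an atom is true iff some body is true."""
    def body_true(pos, neg):
        return set(pos) <= model and not set(neg) & model
    return all((a in model) == any(body_true(pos, neg)
                                   for h, pos, neg in program if h == a)
               for a in atoms_of(program))

def wfscomp(program):
    """Literals true in every supported two-valued model extending WFS."""
    true, false = wfs(program)
    atoms = atoms_of(program)
    undefined = [a for a in atoms if a not in true and a not in false]
    models = [true | set(extra)
              for size in range(len(undefined) + 1)
              for extra in combinations(undefined, size)
              if is_supported(program, true | set(extra))]
    if not models:
        return true, false
    return set.intersection(*models), set(atoms) - set.union(*models)

cases = [("a", [], ["b"]), ("b", [], ["a"]), ("p", ["a"], []), ("p", ["b"], [])]
no_stable = [("b", ["a"], []), ("a", ["b"], []), ("a", [], ["b"])]
print(wfscomp(cases))
print(wfscomp(no_stable))
```

On the reasoning-by-cases program the sketch leaves a and b undecided but makes p true, and on the program without stable models it makes both a and b true, in line with the discussion above. The enumeration is exponential in the number of undefined atoms, so it is meant only for small examples.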

We argue that COMP is very weak at inferring negative literals but not so at
inferring positive literals. On the other hand, WFS is strong enough to infer many
negative literals but very weak at inferring positive atoms. In general, STABLE
derives many literals and so it becomes inconsistent in cases where we argue that
there is an intended model. We now see a typical (well-known) example where COMP
fails to give the intended semantics of a program.

    edge(a,b).
    edge(c,d).
    reachable(a).
    reachable(X) ← reachable(Y), edge(Y, X).
    unreachable(X) ← ¬reachable(X).

Here, edge(a, b) means that there is a directed edge from a to b.

We obviously expect vertices c, d to be unreachable, and indeed, Clark’s
semantics implies it, i.e.,

    comp(P) ⊨ unreachable(c) and
    comp(P) ⊨ unreachable(d)

Suppose we add to P the clause edge(d, c) and call the resulting program P'.
Although we still expect c and d to be unreachable, the Clark semantics
of P' does not imply that c and d are unreachable. This example illustrates well
why COMP is weak at inferring negative literals. Our proposed semantics, on the
other hand, gives the intended model. We consider that WFSCOMP is a good
combination of WFS and COMP.
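
The weakness of COMP on P' can be made concrete by enumerating the models of the completion. The sketch below is an illustration under stated assumptions: since the completion fixes the edge atoms by the facts and determines unreachable from reachable, it only enumerates candidate extensions of reachable over the four constants; the names NODES and supported_reachable_sets are hypothetical.

```python
from itertools import combinations

NODES = ["a", "b", "c", "d"]

def supported_reachable_sets(edges):
    """All sets R satisfying the completed definition of reachable:
    reachable(x) <-> x == a  or  (reachable(y) and edge(y, x)) for some y."""
    result = []
    for size in range(len(NODES) + 1):
        for candidate in combinations(NODES, size):
            reached = set(candidate)
            if all((x in reached)
                   == (x == "a" or any((y, x) in edges for y in reached))
                   for x in NODES):
                result.append(reached)
    return result

P_EDGES = {("a", "b"), ("c", "d")}
P_PRIME_EDGES = P_EDGES | {("d", "c")}
print(supported_reachable_sets(P_EDGES))
print(supported_reachable_sets(P_PRIME_EDGES))
```

For P there is a single candidate, in which only a and b are reachable. For P' a second candidate appears in which c and d support each other through the cycle, so the completion alone no longer entails unreachable(c) or unreachable(d).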

5 Conclusions 

We introduced the fix-point semantics of partial-order programs. The seminaive
method to evaluate the fix-point corresponds basically to dynamic programming.
In this way we see that it is possible to obtain an efficient computation of
partial-order programs. We saw that this paradigm can be integrated with standard
logic programming. We also defined a new declarative semantics of normal
programs that in general appears promising. It can be used to give a semantics of
normal+partial-order programs, using a direct translation of this class of
programs to normal programs. Our general claim is that partial-order programming
is useful and should be integrated into logic programming.
Abstract. We give a general approach to characterizing minimal information
in a modal context. Our modal treatment can be used for many
applications, but is especially relevant under epistemic interpretations
of an operator. Relative to a modal system S, we give three characterizations
of minimality of a formula $\varphi$ and give conditions under which
these characterizations are equivalent. We then argue that rather than
using bisimulations, it is more appropriate to base information orders
on Ehrenfeucht-Fraïssé games to come up with a satisfactory analysis of
minimality. Moving to the realm of epistemic logics, we show that for one
of these information orders almost all systems trivialize, i.e., either all
or no formulas are honest. The other order is much more promising as it
permits minimization with respect to positive knowledge. The resulting notion
of minimality coincides with well-established accounts of minimal knowledge in
S5. For S4 we compare the two orders.



1 Introduction 

This paper offers a general account of the issue of the minimal informational
content of modal assertions. This issue is perhaps most prominent in the area of
modal epistemic logic [6], where researchers have addressed questions like what it
means that one agent knows more than another one, or whether it makes sense to
claim that one ‘only knows $\varphi$’. To demonstrate that such questions are
non-trivial, note that in an epistemic system with negative introspection, the
assumption that one agent can know more in one state than in another yields
a contradiction, since in such a case, in the second state, the agent would have
knowledge about his ignorance (‘I know that I don’t know ...’) that cannot be
shared in the first state. And, in the case of only knowing, it seems defensible
to argue that one can honestly claim to only know (that one knows) some atomic
fact p, which makes p an honest formula, whereas ‘I only know that I know
p or that I know q’ does not seem to be acceptable.

Studies of ‘only knowing’ ([6,14]) and ‘all I know’ ([11]) have largely been
restricted to particular modal systems, such as S5, S4 and K45. Recently Halpern
[5] has also taken other modal systems such as K, T and KD45 into account.
Although his approach suggests similar results for e.g. KD4, we would like to 
adopt a more general perspective: given any modal system, how to characterize 
the minimal informational content of modal formulas. Besides arbitrary normal 
systems we prefer to use standard Kripke models, instead of Fagin and Vardi’s 
knowledge structures, and Halpern’s tree models. 

Our approach is motivated by the question what kind of conclusions can
be derived from a modal premise $\varphi$, if this premise is understood as minimal
information. For instance, if $\varphi = \Box p$ one would derive $\Diamond q$
under this reading of premises (if one only has the information that p, any
independent fact q may be possible), whereas in ordinary modal logic $\Diamond q$
is not derived from $\Box p$.¹

There are three ways to study the question whether a formula $\varphi$ allows for
such a minimal interpretation, and, if so, what can be said about the
consequences of $\varphi$ under this interpretation.

The first approach is a semantical one: given a formula $\varphi$, try to identify
models for $\varphi$ that carry the least information. This approach requires a
suitable order between states (i.e., model-world pairs) in order to identify
minimal (or, rather, least) elements. For the simple (universal) S5-models the
order coincides with the superset relation between sets of worlds. Our challenge
here is to give a general definition of such an order, which also suits other
modal systems.

The second approach is mainly syntactic in nature and presupposes a sublanguage
$\mathcal{L}^*$ of ‘special’ formulas. Given a consistent formula $\varphi$, we
then try to find a maximally consistent set containing $\varphi$ with a smallest
$\mathcal{L}^*$-part. This approach can be identified as the search for so-called
stable expansions, which are related to maximally consistent sets in a
straightforward way. Since consistency is defined as a deductive property, and
maximally consistent sets pop up as canonical states, there is also a deductive
and semantic flavour to this approach.

The third and last approach is purely deductive, and is also known as the
disjunction property: $\varphi$ allows for a minimal interpretation if for any
disjunction in $\mathcal{L}^*$ that can be derived from $\varphi$, one disjunct is
derivable from $\varphi$. So, loosely speaking, in choosing between a number of
cases, $\varphi$ forces a decision.

There are several routes to interconnect these approaches. One route is from
syntax to semantics. Here the main concern is to find orders on states that
preserve the truth of formulas in $\mathcal{L}^*$. Note how such an order models
growth of information. The reverse route, from semantics to syntax, starts with an
order between states, and tries to identify a suitable persistent sublanguage
$\mathcal{L}^*$. Our main contribution here is to define orders between models by
imposing ‘back’ (preserving knowledge) and ‘forth’ (preserving uncertainty)
clauses in a way intimately related to Ehrenfeucht-Fraïssé games, giving us a
powerful mechanism to fine-tune the order for broad classes of modal logics.

After giving some technical preliminaries in Section 2, in Section 3 we
formally link up the three approaches to minimality, and, for two specific orders,
we identify the corresponding languages. In Section 4 we evaluate our notions of
minimality for epistemic systems. We round off with a conclusion section.

¹ Note that in the partial logic advocated in [8] $\Diamond q$ does not follow
from $\Box p$ even when the latter is interpreted as minimal information, because
$\Diamond q$ is only true in the presence of positive evidence for q.

2 Preliminaries 

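Our language $\mathcal{L}$ is just that of modal logic, including the modal
operator $\Box$ and its dual $\Diamond$. We assume that formulas $\varphi$ and
$\psi$ in $\mathcal{L}$ are composed from a finite set of propositional atoms
$\mathcal{P} = \{p, q, r, \ldots\}$, using the modal operators and the classical
connectives. A special atom $\bot$ is defined as $(p \wedge \neg p)$, whereas
$\top = \neg\bot$.

The function $d : \mathcal{L} \to \mathbb{N}$ calculates the modal depth of
formulas as follows: $d(p) = d(\bot) = d(\top) = 0$ ($p \in \mathcal{P}$),
$d(\neg\varphi) = d(\varphi)$, $d(\varphi \star \psi) = \max\{d(\varphi), d(\psi)\}$
for $\star = \wedge, \vee, \to$ and, finally,
$d(\Box\varphi) = d(\Diamond\varphi) = 1 + d(\varphi)$. Some properties
to be presented are relative to a given subset $\mathcal{L}^* \subseteq \mathcal{L}$.
An example of such a sublanguage of $\mathcal{L}$ is
$\mathcal{L}_{(n)} = \{\varphi \in \mathcal{L} \mid d(\varphi) \leq n\}$. For a
unary operator $\triangle = \neg, \Diamond, \Box$ and language
$\mathcal{L}^* \subseteq \mathcal{L}$,
$\triangle\mathcal{L}^* = \{\triangle\varphi \mid \varphi \in \mathcal{L}^*\}$.
We also use an ‘inverse’: $\triangle^{-}\mathcal{L}^*$ denotes
$\{\alpha \mid \triangle\alpha \in \mathcal{L}^*\}$.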

We use Kripke models $M = (W, R, V)$ as a standard interpretation of the modal
language. Instead of $Rwv$, we also write $v \in R[w]$. Here, the key notion is
the pair $(M, w)$ (often written as $M, w$), also called a state, in which each
modal formula $\varphi$ receives its standard interpretation with the typical
modal case: $M, w \models \Box\varphi$ iff for all $v \in R[w]$ one has
$M, v \models \varphi$. For $\Gamma \subseteq \mathcal{L}$, $M, w \models \Gamma$
means that for all $\gamma \in \Gamma$: $M, w \models \gamma$. Relative to a given
set of models $\mathcal{S}$, consequence is defined by
$\Gamma \models_{\mathcal{S}} \varphi$ iff for all $M \in \mathcal{S}$:
$M, w \models \Gamma$ implies $M, w \models \varphi$.
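
As an illustration of the two definitions above, the following Python sketch computes modal depth and evaluates formulas in a finite Kripke model. The encoding is an assumption made here, not something taken from the paper: formulas are nested tuples such as ("box", ("atom", "p")), and a model is a triple (W, R, V) in which R maps each world to its set of successors and V maps each world to the set of atoms true there.

```python
def depth(f):
    """Modal depth d of a formula given as a nested tuple."""
    op = f[0]
    if op in ("atom", "bot", "top"):
        return 0
    if op == "not":
        return depth(f[1])
    if op in ("and", "or", "imp"):
        return max(depth(f[1]), depth(f[2]))
    if op in ("box", "dia"):
        return 1 + depth(f[1])
    raise ValueError(f"unknown connective: {op}")

def holds(model, w, f):
    """Standard Kripke satisfaction M, w |= f on a finite model."""
    _, R, V = model
    op = f[0]
    if op == "atom":
        return f[1] in V[w]
    if op == "bot":
        return False
    if op == "top":
        return True
    if op == "not":
        return not holds(model, w, f[1])
    if op == "and":
        return holds(model, w, f[1]) and holds(model, w, f[2])
    if op == "or":
        return holds(model, w, f[1]) or holds(model, w, f[2])
    if op == "imp":
        return (not holds(model, w, f[1])) or holds(model, w, f[2])
    if op == "box":
        return all(holds(model, v, f[1]) for v in R[w])
    if op == "dia":
        return any(holds(model, v, f[1]) for v in R[w])
    raise ValueError(f"unknown connective: {op}")

# Two worlds: 0 sees 1, and p holds only at 1.
example = ({0, 1}, {0: {1}, 1: set()}, {0: set(), 1: {"p"}})
print(depth(("box", ("or", ("dia", ("atom", "p")), ("atom", "q")))))
print(holds(example, 0, ("box", ("atom", "p"))))
```

Note that a world without successors satisfies every formula of the form box f, which is why seriality matters for systems such as KD.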

A main question in this paper is how truth is preserved by moving from one
state to another. Let $\mathcal{L}^* \subseteq \mathcal{L}$ and let $\mathcal{S}$
be a set of states. We say that an order $\leq$ on $\mathcal{S}$ preserves the
sublanguage $\mathcal{L}^*$, or that $\mathcal{L}^*$ is persistent over $\leq$ in
$\mathcal{S}$, iff

    $M, w \leq M', w' \Rightarrow$ for all $\varphi \in \mathcal{L}^*$ : $(M, w \models \varphi \Rightarrow M', w' \models \varphi)$

If the overall converse holds, we say that $\mathcal{L}^*$ characterizes $\leq$ on
$\mathcal{S}$.

We will discuss several logical systems S on top of the minimal modal system
K, assuming familiarity with the notion of derivability in a modal system S. In
particular, for a set of premises $\Gamma$, we write $\Gamma \vdash_S \varphi$ if
there is a derivation of $\varphi$ (without applications of necessitation to the
premises) from $\Gamma$ in S. The formulas $\varphi$ and $\psi$ are equivalent in
S, or S-equivalent, if both $\varphi \vdash_S \psi$ and $\psi \vdash_S \varphi$.
The logic S is called finitary for a sublanguage $\mathcal{L}^*$ if it induces
finitely many S-equivalence classes in $\mathcal{L}^*$. As an immediate
consequence of our assumption that $\mathcal{P}$ is finite, we have that every S
we will consider is finitary layered, in the sense that for each
$n \in \mathbb{N}$, S is finitary for $\mathcal{L}_{(n)}$.

The generalisation that we use to lift $\vdash_S$ from a subset of
$2^{\mathcal{L}} \times \mathcal{L}$ to a subset of
$2^{\mathcal{L}} \times 2^{\mathcal{L}}$ is more analogous to the one used in
sequent calculi (cf. [10]) than to our Hilbert-style presentation of a logical
system:

    $\Gamma \vdash_S \Delta \iff \exists \delta_1 \ldots \delta_n \in \Delta : \Gamma \vdash_S (\delta_1 \vee \cdots \vee \delta_n)$

if $\Delta \neq \emptyset$, and $\Gamma \vdash_S \emptyset$ if $\Gamma \vdash_S \bot$.
The minimal system K contains the rule of necessitation
($\emptyset \vdash \varphi \Rightarrow \emptyset \vdash \Box\varphi$) and the modal
axiom K : $\Box(\varphi \to \psi) \to (\Box\varphi \to \Box\psi)$ on top of any
Hilbert-style axiomatization of propositional logic. Here, we are interested in
normal systems S, i.e. systems that are obtained by adding modal axioms to K. One
such extension, T, is obtained by adding the axiom T : $\Box\varphi \to \varphi$ to
K. The system KD is named after the axiom that distinguishes it from K, which is
D : $\Diamond\top$ or, equivalently, $\Box\varphi \to \Diamond\varphi$. In
epistemic logic, systems that include axiom 4 : $\Box\varphi \to \Box\Box\varphi$
are called positively introspective, those that have axiom
5 : $\neg\Box\varphi \to \Box\neg\Box\varphi$ are called negatively introspective.
Two other axioms worth mentioning are B : $\varphi \to \Box\Diamond\varphi$ and
G : $\Diamond\Box\varphi \to \Box\Diamond\varphi$. Any combination of the
axioms mentioned is called an epistemic logic, here. Typical examples of such
systems are S4 (T + 4), S5 (S4 + 5) and S4.2 (S4 + G).

When $\Box$ is interpreted as belief, the axiom T is often replaced by D, which
gives rise to systems that obtain their name directly from the constituting
axioms: KD, KD4, KD45, etc.

The set of states verifying S (S-states for short) is called $States_S$. For a
given formula $\varphi$ we define
$States_S(\varphi) = \{(M, w) \in States_S \mid M, w \models \varphi\}$.

Given a logic S we say that $\varphi \in \mathcal{L}$ satisfies the S-disjunction
property (S-DP) over a sublanguage $\mathcal{L}^*$, iff $\varphi$ is S-consistent
and for every $\psi_1, \psi_2, \ldots, \psi_k \in \mathcal{L}^*$:

    $\varphi \vdash_S (\psi_1 \vee \cdots \vee \psi_k) \Rightarrow$ for some $i \leq k$ : $\varphi \vdash_S \psi_i$

A set of formulas $\Gamma$ is S-consistent if $\Gamma \nvdash_S \bot$, and
maximally S-consistent (S-m.c.) if it moreover contains all the formulas $\varphi$
for which $\Gamma \cup \{\varphi\}$ is consistent. All S-m.c. sets together
constitute the set of possible worlds $W_S$ in the canonical model
$M_S = (W_S, R_S, V_S)$ for S. Thus, maximal consistent sets play a crucial
role when proving (strong) completeness of S wrt any class of models $\mathcal{S}$
that contains $M_S$: $\Gamma \models_{\mathcal{S}} \varphi \Rightarrow \Gamma \vdash_S \varphi$.
The converse of this implication is called (strong) soundness of S wrt
$\mathcal{S}$.

Many classes of models have been identified that are sound and complete wrt
the systems S that were mentioned above (see [3]). Most significant are those
classes of models that are determined by a property of their accessibility
relation. For instance, for KD one takes the serial (i.e.,
$\forall x \exists y\, Rxy$) Kripke models, for T the accessibility relation is
reflexive, in KD4 it is serial and transitive, in S5 it is an equivalence
relation.



3 Minimal information in modal logic 

Let S be an arbitrary modal system, $\leq$ a pre-order on states. A formula
$\varphi$ is called honest with respect to S and $\leq$, if there exists a least
S-state verifying $\Box\varphi$. More precisely, $\varphi$ is S-honest (for $\leq$)
iff there is an S-state $M, w$ such that:

- $M, w \models \Box\varphi$,

- $M', w' \models \Box\varphi \Rightarrow M, w \leq M', w'$ for all $(M', w') \in States_S$.

In Section 3.1, we will give independent characterizations of honesty.

3.1 General characterizations of minimality

Let $\mathcal{L}^*$ be a sublanguage of $\mathcal{L}$. We consider the following
approaches to minimality:

(1) Formula $\varphi$ has a $\leq$-least verifying S-state

(2) $\varphi$ has an $\mathcal{L}^*$-smallest m.c. expansion

(3) $\varphi$ has S-DP with respect to $\mathcal{L}^*$

The next result, which is visualized in Figure 1, relates the three approaches.

Theorem 1. Let $\mathcal{L}^*$ be persistent over $\leq$. Then the minimality
approaches (2) and (3) are equivalent, while (1) implies both (2) and (3).




Fig. 1. Relating least states, expansions and disjunction properties wrt $\mathcal{L}^*$



The proofs of this and the next theorem invoke the canonical S-model and
exploit the usual properties of m.c. sets. Without going into details, the
equivalence of (2) and (3) is mirrored in the known fact that $\Sigma$ is an
S-m.c. set iff both

- $\Sigma$ is deductively closed, i.e. $\Sigma \vdash_S \psi \Rightarrow \psi \in \Sigma$, and

- $\Sigma$ has S-DP, i.e. $\Sigma$ is S-consistent and for all
  $\psi_1, \ldots, \psi_k \in \mathcal{L}$:
  $\Sigma \vdash_S \psi_1 \vee \cdots \vee \psi_k \Rightarrow \Sigma \vdash_S \psi_i$ for some $i \leq k$.

Persistence does not guarantee the implications (3) $\Rightarrow$ (1) and
(2) $\Rightarrow$ (1) to hold in general. They can be established by adding the
converse of persistence. Let us say that in that case the minimal information
equivalences hold for $\mathcal{L}^*$ and $\leq$.

Theorem 2. Let $\mathcal{L}^*$ be a persistent sublanguage of $\mathcal{L}$ which
also characterizes $\leq$. Then the minimal information equivalences hold for
$\mathcal{L}^*$ and $\leq$.



² For full proofs we refer to the forthcoming technical report.




Persistence and Minimality in Epistemic Logic 






Corollary 3. Let $\mathcal{L}^* \subseteq \mathcal{L}$ be persistent and
characterizing for $\leq$. Then the following propositions are equivalent:

- $\varphi$ is S-honest for $\leq$

- $\Box\varphi$ has an $\mathcal{L}^*$-smallest m.c. expansion

- $\Box\varphi$ has S-DP over $\mathcal{L}^*$

In the literature there is a lot of emphasis on stable sets; in our terminology a
stable set is simply the knowledge contained in an m.c. set, or more formally:
$\Sigma$ is stable if $\Sigma = \Box^{-}\Gamma$ for some m.c. $\Gamma$. The second
condition for $\varphi$ being honest can thus be rephrased as:

- $\varphi$ has a $\Box^{-}\mathcal{L}^*$-smallest stable expansion³

Although this solves the problem of alternative characterization of honesty in 
an abstract sense, the solution is not entirely satisfactory. Most importantly, it 
is unclear whether suitable orders exist that enable persistent sublanguages to 
characterize them. And, if so, we would like to specify them in an independent, 
insightful way. With this end in view we propose several specific orders in the 
next subsections. 

3.2 Bisimulation and minimality 

A first idea which jumps to mind when using preservation results in charac- 
terizing minimality, is to employ the notion of bisimulation. Bisimulations are 
well-known structural descriptions of equivalences between states [2] . 

Definition 4. Let $M = (W, R, V)$ and $M' = (W', R', V')$ be two Kripke models.
A bisimulation $B$ is a non-empty relation $B \subseteq W \times W'$ such that for
all $w \in W$, $w' \in W'$ with $Bww'$:

- $V(w) = V'(w')$

- if $Rwu$ for some $u \in W$, then there is a $u' \in W'$ with $R'w'u'$ and $Buu'$ (forth)

- if $R'w'u'$ for some $u' \in W'$, then there is a $u \in W$ with $Rwu$ and $Buu'$ (back)

If there exists a bisimulation $B$ between the states $(M, w)$ and $(M', w')$, we
say that the two states bisimulate and we write $(M, w) =_b (M', w')$.

Now, the following well-known result ([2]) guarantees that any language
$\mathcal{L}^*$ is invariant (i.e., two-way persistent) over bisimulations.

Theorem 5. For all $\varphi \in \mathcal{L}$ and all states $(M, w)$, $(M', w')$
one has:

    $(M, w) =_b (M', w') \Rightarrow (M, w \models \varphi \iff M', w' \models \varphi)$
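
On finite models, Definition 4 can be checked mechanically by computing the largest bisimulation as a greatest fixpoint. The sketch below is an illustration only: models are triples (W, R, V) with R mapping each world to its set of successors and V mapping each world to its set of true atoms, and the function names are hypothetical.

```python
def largest_bisimulation(m1, m2):
    """Start from all pairs with equal valuations and repeatedly delete pairs
    that violate the forth or the back clause; what remains is the largest
    bisimulation between the two finite models (possibly empty)."""
    (W1, R1, V1), (W2, R2, V2) = m1, m2
    rel = {(w, v) for w in W1 for v in W2 if V1[w] == V2[v]}
    changed = True
    while changed:
        changed = False
        for (w, v) in list(rel):
            forth = all(any((u, u2) in rel for u2 in R2[v]) for u in R1[w])
            back = all(any((u, u2) in rel for u in R1[w]) for u2 in R2[v])
            if (w, v) in rel and not (forth and back):
                rel.discard((w, v))
                changed = True
    return rel

def bisimilar(m1, w1, m2, w2):
    return (w1, w2) in largest_bisimulation(m1, m2)

loop = ({0}, {0: {0}}, {0: {"p"}})                        # one reflexive p-world
cycle = ({0, 1}, {0: {1}, 1: {0}}, {0: {"p"}, 1: {"p"}})  # two p-worlds in a cycle
dead_end = ({0}, {0: set()}, {0: {"p"}})                  # one p-world, no successor
print(bisimilar(loop, 0, cycle, 0), bisimilar(loop, 0, dead_end, 0))
```

The reflexive point and the two-cycle are bisimilar, whereas the dead end is not bisimilar to either, since the forth clause fails immediately. The infinite models of Figure 2 are outside the scope of such a finite check.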

Nevertheless, the existence of a bisimulation is not a necessary condition for
modal equivalence of two states. Consider the pair of models in Figure 2, with
every world having the same local valuation of propositional variables.

Fig. 2. Two models with the same theory

For every natural number n the model M has a branch of length n. M' is
similar to M, except that it also has an infinitely long branch. The world 0
verifies the same formulas in both models, but on the other hand, a bisimulation
between those two states cannot be given. Similar difficulties arise when we
implement weakenings of the notion of bisimulation (e.g., dropping the ‘forth’
requirement). It turns out that in the most intuitive adaptations of bisimulation
for ordering states on their informative content, the state (M, 0) in the figure
above remains a proper extension of (M', 0), while they contain the same
information. A refinement of such orders is therefore needed.

³ Our direct definition of ‘stable expansion’ generalizes the notion characterized
by Moore’s fixpoint equation in [13].



3.3 Ehrenfeucht-Fraïssé orders

We will now inspect orders inspired by Ehrenfeucht-Fraïssé games.⁴ These orders
are defined by means of underlying, ‘layered’ pre-orders. To make the connection,
we will present a general lemma that paves the way.

Suppose $\leq^n$ is a pre-order on states for each natural number n (‘layer n’).
Moreover, let $\leq$ be defined by (with $M = (W, R, V)$ and $M' = (W', R', V')$):

    $M, w \leq M', w' \iff \forall n \in \mathbb{N}\ \forall v' \in R'[w']\ \exists v \in R[w] : M, v \leq^n M', v'$

Finally, let $\mathcal{L}^*$ be a sublanguage and
$\mathcal{L}^*_{(n)} = \mathcal{L}^* \cap \mathcal{L}_{(n)}$ be its subset of
formulas of modal depth up to n.

Lemma 6 (collecting). If $\mathcal{L}^*_{(n)}$ is persistent and characterizing
for $\leq^n$, and $\mathcal{L}^*$ is closed under $\vee$, then
$\Box\mathcal{L}^*$ is persistent and characterizing for $\leq$, i.e.

    $M, w \leq M', w' \iff \forall \varphi \in \Box\mathcal{L}^* : (M, w \models \varphi \Rightarrow M', w' \models \varphi)$.

With this tool we study some important Ehrenfeucht-Fraïssé orders.

⁴ See [4] for the use of Ehrenfeucht-Fraïssé games in first-order predicate logic,
in which modal logic can obviously be embedded [2].
General information order. In the first Ehrenfeucht-Fraïssé order the underlying,
layered order is an equivalence relation. The relation $\simeq^n$ is defined
recursively by:

- $M, w \simeq^0 M', w' \iff V(w) = V'(w')$

- $M, w \simeq^{n+1} M', w' \iff$
    $M, w \simeq^n M', w'$ &
    $\forall v' \in R'[w']\ \exists v \in R[w] : M, v \simeq^n M', v'$ (back) &
    $\forall v \in R[w]\ \exists v' \in R'[w'] : M, v \simeq^n M', v'$ (forth)

Then it can be shown that two states are $\simeq^n$-equivalent iff they verify the
same formulas up to depth n:

    $M, w \simeq^n M', w' \iff \forall \varphi \in \mathcal{L}_{(n)} : (M, w \models \varphi \iff M', w' \models \varphi)$.

From left to right the external equivalence expresses persistence over
$\simeq^n$; from right to left it says that $\mathcal{L}_{(n)}$ characterizes
$\simeq^n$.

Now we define the general information order $\sqsubseteq$ based on stratified
Ehrenfeucht-Fraïssé equivalence:

    $M, w \sqsubseteq M', w' \iff \forall n \in \mathbb{N}\ \forall v' \in R'[w']\ \exists v \in R[w] : M, v \simeq^n M', v'$

Since $\mathcal{L}_{(n)}$ is persistent and characterizing for $\simeq^n$,
Lemma 6 shows that $\Box\mathcal{L}$ is both persistent and characterizing for
$\sqsubseteq$.

Lemma 7 (modal characterization of $\sqsubseteq$).

    $M, w \sqsubseteq M', w' \iff \forall \varphi \in \Box\mathcal{L} : (M, w \models \varphi \Rightarrow M', w' \models \varphi)$.

So, by Theorem 2, we obtain the minimal information equivalences for the
general information order.

Theorem 8. For any S, the minimal information equivalences hold for
$\sqsubseteq$ and $\Box\mathcal{L}$.

By Corollary 3 this implies that $\varphi$ is honest for $\sqsubseteq$ iff
$\Box\varphi$ has a $\Box\mathcal{L}$-smallest m.c. expansion, i.e. $\varphi$ has a
smallest stable expansion. And in terms of disjunction properties, $\varphi$ is
honest iff $\Box\varphi$ has S-DP over the full language. As we will show,
$\sqsubseteq$ is not a proper order for many epistemic systems. Therefore we
introduce yet another Ehrenfeucht-Fraïssé order.
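
For finite models the layered equivalence and the general information order can be computed directly from their recursive definitions. The Python sketch below is only an illustration: it assumes the reading of the definitions given above (layer 0 compares valuations, each further layer adds a back and a forth clause), uses the same (W, R, V) encoding as before with R mapping worlds to successor sets, and relies on the observation that on finite models the layers stop changing after at most |W| times |W'| steps. All names are hypothetical.

```python
from functools import lru_cache

def make_equiv(m1, m2):
    """Return equiv(n, w, v): are (m1, w) and (m2, v) related at layer n?"""
    (_, R1, V1), (_, R2, V2) = m1, m2

    @lru_cache(maxsize=None)
    def equiv(n, w, v):
        if n == 0:
            return V1[w] == V2[v]
        return (equiv(n - 1, w, v)
                and all(any(equiv(n - 1, u, u2) for u in R1[w])
                        for u2 in R2[v])                      # back
                and all(any(equiv(n - 1, u, u2) for u2 in R2[v])
                        for u in R1[w]))                      # forth
    return equiv

def general_order(m1, w, m2, v):
    """For every layer n and every successor of v there is a layer-n
    equivalent successor of w. Layers only shrink, so on finite models it is
    enough to check n up to |W1| * |W2|."""
    (W1, R1, _), (W2, R2, _) = m1, m2
    equiv = make_equiv(m1, m2)
    bound = len(W1) * len(W2)
    return all(any(equiv(n, u, u2) for u in R1[w])
               for n in range(bound + 1) for u2 in R2[v])

short = ({0, 1}, {0: {1}, 1: set()}, {0: set(), 1: set()})               # 0 -> 1
longer = ({0, 1, 2}, {0: {1}, 1: {2}, 2: set()}, {0: set(), 1: set(), 2: set()})
eq = make_equiv(short, longer)
print(eq(1, 0, 0), eq(2, 0, 0), general_order(short, 0, longer, 0))
```

The two chains agree at layer 1 but are separated at layer 2, matching the fact that a formula of depth 2 (two nested diamonds on a tautology) distinguishes them.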



Positive information order. The second order is based on a genuine stratified
pre-order. The relation $\trianglelefteq^n$ is defined recursively by:

- $M, w \trianglelefteq^0 M', w' \iff V(w) = V'(w')$

- $M, w \trianglelefteq^{n+1} M', w' \iff$
    $M, w \trianglelefteq^n M', w'$ &
    $\forall v' \in R'[w']\ \exists v \in R[w] : M, v \trianglelefteq^n M', v'$ (back)

Notice the ‘forth’ direction is typically missing here. Next we define the positive
information order $\trianglelefteq$ based on the stratified Ehrenfeucht-Fraïssé
pre-orders:

    $M, w \trianglelefteq M', w' \iff \forall n \in \mathbb{N}\ \forall v' \in R'[w']\ \exists v \in R[w] : M, v \trianglelefteq^n M', v'$

In epistemic terms, the order preserves positive knowledge. We define the
relevant sublanguage by:

    $\mathcal{L}^+ = \{\varphi \in \mathcal{L} \mid \varphi$ contains neither $\Box$ in the scope of $\neg$ nor $\Diamond\}$

So DpV Dg, and DpA are members of £+, but and OpV □<; are not. 
By defining O as the sublanguage £+ amounts to the closure under A, V 

and □ of propositional formulas. Recall that = £+ n £(„). One can prove, 
by induction, the following persistence and characterization result for 

M, w ££ M', w' ^ V(pG £+ ) : (M, tc h £ ^ M' , w' h £) (1) 

Notice that on the right hand side of (1), we do not have an equivalence now; 
this is due to the fact that is not closed under -i (if n > 0). 
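
The definition of $\mathcal{L}^+$ is purely syntactic, so membership can be checked by a simple recursion over the formula. The sketch below is an illustration only; the tuple encoding of formulas and the function names `is_propositional` and `in_l_plus` are our own assumptions.

```python
# Formulas as nested tuples (an assumed encoding):
# ("atom", "p"), ("not", f), ("and", f, g), ("or", f, g), ("box", f), ("dia", f)

def is_propositional(f):
    """True if f contains no modal operator at all."""
    if f[0] == "atom":
        return True
    if f[0] in ("box", "dia"):
        return False
    return all(is_propositional(g) for g in f[1:])

def in_l_plus(f):
    """True if f has no diamond and no box in the scope of a negation."""
    op = f[0]
    if op == "atom":
        return True
    if op == "dia":
        return False
    if op == "not":
        # a negation may only govern a purely propositional formula
        return is_propositional(f[1])
    if op in ("and", "or"):
        return in_l_plus(f[1]) and in_l_plus(f[2])
    if op == "box":
        return in_l_plus(f[1])
    raise ValueError(f"unknown operator: {op}")

p, q = ("atom", "p"), ("atom", "q")
print(in_l_plus(("or", ("box", p), ("box", q))))   # box p or box q
print(in_l_plus(("and", ("box", p), ("not", q))))  # box p and not q
print(in_l_plus(("not", ("box", p))))              # not box p
print(in_l_plus(("or", ("dia", p), ("box", q))))   # dia p or box q
```

The first two formulas are accepted and the last two rejected, in line with the definition above.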

Again, by the strong persistence property (1) and the collecting Lemma 6, we can easily prove that $\Box\mathcal{L}^+$ is both persistent and characterizing for $\leq$.

Lemma 9 (modal characterization of $\leq$).

$M, w \leq M', w' \iff \forall \varphi \in \Box\mathcal{L}^+ : (M, w \models \varphi \Rightarrow M', w' \models \varphi)$.

Theorem 10. For any S, the minimal information equivalences hold for $\leq$ and $\Box\mathcal{L}^+$.

The latter theorem follows immediately from Lemma 9 and Theorem 2. Using Corollary 3, it implies that $\varphi$ is honest for $\leq$ iff $\Box\varphi$ has a $\Box\mathcal{L}^+$-smallest m.c. expansion, i.e., iff $\varphi$ has an $\mathcal{L}^+$-smallest stable expansion. And in terms of disjunction properties, $\varphi$ is S-honest for $\leq$ iff $\Box\varphi$ has S-DP over $\Box\mathcal{L}^+$.

A simple, yet important case of the positive information order is the submodel relation. $(M', w)$ is a submodel of $(M, w)$ (written $M, w \supseteq M', w$) iff $W' \subseteq W$, $R' = R \cap (W' \times W')$ and $V'(u) = V(u)$ for all $u \in W'$. As a consequence of the Łoś theorem in first-order logic, a modal formula is preserved under submodels iff it is equivalent to a 'positive knowledge' formula, i.e. a formula in $\mathcal{L}^+$ (see [12] and [1, thm.2.10]). Since $\Box\mathcal{L}^+ \subseteq \mathcal{L}^+$ this implies, using Lemma 9, that

Corollary 11.

1. $\supseteq$ preserves $\mathcal{L}^+$

2. If $M, w \supseteq M', w$ then $M, w \leq M', w$

Since the submodel relation is easily established, this provides a convenient tool for proving that two models are related by the positive information order.

4 Evaluating minimality in epistemic systems

In this section we evaluate the two information orders introduced before in the light of our broad class of epistemic systems, as introduced in Section 2. In fact, we will first prove a negative result for an even larger class. To this purpose we define the notion of a Geach logic, see [3]. This is a normal modal system which contains (in addition to K) axioms of the form:

$\Diamond^k \Box^l \varphi \rightarrow \Box^m \Diamond^n \varphi$ with $k, l, m, n \in \mathbb{N}$, (2)

where $\Box^i$ and $\Diamond^i$ are defined recursively: $\Box^0 \varphi = \Diamond^0 \varphi = \varphi$, $\Box^{i+1} \varphi = \Box\Box^i \varphi$ and $\Diamond^{i+1} \varphi = \Diamond\Diamond^i \varphi$.

The regularity of Geach logics is caused by the general correspondence between an axiom of the form (2) and the class of models in which the accessibility relation is $k, l, m, n$-confluent:

$\forall x, y, z \in W : (R^k xy \ \&\ R^m xz) \Rightarrow \exists w \in W : (R^l yw \ \&\ R^n zw)$. (3)

Here $R^0$ is simply the identity relation, and $R^{n+1} = R \circ R^n$, the relational composition of $R$ and $R^n$.

Every Geach logic corresponds to a conjunction of relational restrictions as given in (3). One easily verifies that all our epistemic logics are Geach logics.
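
Condition (3) can be tested mechanically on a finite frame. The sketch below is our own illustration, with hypothetical helper names `compose`, `power` and `is_confluent`; relations are encoded as sets of pairs. As familiar special cases, reflexivity (axiom T) is $0,1,0,0$-confluence and transitivity (axiom 4) is $0,1,2,0$-confluence.

```python
def compose(R, S):
    """Relational composition: x (R o S) z iff x R y and y S z for some y."""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}

def power(R, n, W):
    """R^0 is the identity on W, and R^(i+1) = R o R^i."""
    result = {(x, x) for x in W}
    for _ in range(n):
        result = compose(R, result)
    return result

def is_confluent(W, R, k, l, m, n):
    """Check condition (3): k,l,m,n-confluence of R on the finite set W."""
    Rk, Rl, Rm, Rn = (power(R, i, W) for i in (k, l, m, n))
    return all(any((y, w) in Rl and (z, w) in Rn for w in W)
               for (x, y) in Rk for (x2, z) in Rm if x == x2)

W = {0, 1, 2}
chain = {(0, 1), (1, 2)}
closed = chain | {(0, 2)}

print(is_confluent(W, chain, 0, 1, 2, 0))   # is the chain transitive?
print(is_confluent(W, closed, 0, 1, 2, 0))  # and after adding (0, 2)?
print(is_confluent(W, closed, 0, 1, 0, 0))  # is it reflexive?
```

The same function covers the other epistemic axioms by choosing the indices accordingly, for example $1,0,1,1$ for axiom 5.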



4.1 General minimality 

The general information order specifies that one world is smaller than another 
world if and only if in the first world less information is true than in the second. 
It turns out that for most systems this order is not appropriate. In most Geach 
logics this order trivializes the notion of general honesty: either all formulas are 
honest, or (nearly) all formulas are dishonest. 

Trivial honesty. For weak modal systems such as K, K4, KD and KD4 it can be proved by a simple model-theoretic technique that all formulae $\varphi$ such that $\Box\varphi$ is consistent are honest with respect to the general information order. This technique is called simple amalgamation. For two S-states a simple amalgamation is constructed by adding one world $w^*$ from which all worlds are accessible that are accessible from one of the original two states. The construction is depicted in Figure 3. For every formula $\varphi$ we obtain:

$(M, w \models \Box\varphi \ \&\ M', w' \models \Box\varphi) \iff M^*, w^* \models \Box\varphi$ (4)

Let us suppose that the class of S-models is sound and complete wrt S and that this class is closed under amalgamation, which is the case for each of these weak modal systems. This proves the disjunction property of any consistent $\Box\varphi$ over $\Box\mathcal{L}$, by using contraposition: Suppose that $\Box\varphi \nvdash_S \Box\psi_1$ and $\Box\varphi \nvdash_S \Box\psi_2$; then we can find S-states with $M, w \models \Box\varphi \land \neg\Box\psi_1$ and $M', w' \models \Box\varphi \land \neg\Box\psi_2$. By (4) we know that $M^*, w^* \models \Box\varphi$, but obviously, also $M^*, w^* \models \neg(\Box\psi_1 \lor \Box\psi_2)$. In other words, $\Box\varphi \nvdash_S \Box\psi_1 \lor \Box\psi_2$.

Fig. 3. A simple amalgamation of the states (M, w) and (M', w').



This argument shows that in every system S with only axioms of the form $\Box^l \varphi \rightarrow \Box^m \Diamond^n \varphi$ with $l, m + n > 0$ trivial honesty occurs, since the semantic conditions for such Geach logics are preserved under simple amalgamation.
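
The amalgamation argument can be replayed on a small example. The following sketch is our own illustration for the system K (no frame conditions); the model encoding, the evaluator `holds` and the function `simple_amalgamation` are assumptions, and the two input models are assumed to have disjoint sets of worlds.

```python
# Formulas: ("atom", "p"), ("not", f), ("and", f, g), ("or", f, g), ("box", f)
# Model: R maps worlds to successor sets, V maps worlds to sets of true atoms.

def holds(R, V, w, f):
    op = f[0]
    if op == "atom":
        return f[1] in V[w]
    if op == "not":
        return not holds(R, V, w, f[1])
    if op == "and":
        return holds(R, V, w, f[1]) and holds(R, V, w, f[2])
    if op == "or":
        return holds(R, V, w, f[1]) or holds(R, V, w, f[2])
    if op == "box":
        return all(holds(R, V, v, f[1]) for v in R[w])
    raise ValueError(f"unknown operator: {op}")

def simple_amalgamation(R1, V1, w1, R2, V2, w2, root="w*"):
    """Add a fresh root that sees exactly the successors of w1 and of w2."""
    R = {**R1, **R2, root: R1[w1] | R2[w2]}
    V = {**V1, **V2, root: set()}  # the valuation at the root is arbitrary
    return R, V, root

p, q = ("atom", "p"), ("atom", "q")
R1, V1 = {"w": {"a"}, "a": set()}, {"w": set(), "a": {"p"}}
R2, V2 = {"w'": {"b"}, "b": set()}, {"w'": set(), "b": {"q"}}
R, V, root = simple_amalgamation(R1, V1, "w", R2, V2, "w'")

phi = ("or", p, q)
print(holds(R, V, root, ("box", phi)))  # box(p or q) holds at the root
print(holds(R, V, root, ("box", p)))    # box p fails at the root
print(holds(R, V, root, ("box", q))) # box q fails at the root
```

The root satisfies $\Box(p \lor q)$ but neither $\Box p$ nor $\Box q$, which is the shape of countermodel used in the contraposition above.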

Trivial dishonesty. On the other hand, for many Geach logics the notion of full honesty deflates seriously. Such systems often contain theorems of the form $\Box\varphi_1 \lor \Box\varphi_2$ where $\Box\varphi_1$ and $\Box\varphi_2$ are not theorems (*). In particular, systems which incorporate Geach axioms with $k, m > 0$ have the property (*).† The disjunction property is then easily violated. In fact, one can prove the following.

Theorem 12. For any epistemic logic S except for T, S4, and the weak systems K, KD, K4, and KD4, no S-consistent formula is honest.

Systems without consistent honest formulas include K45, KD45 and S5, but also those with a milder form of negative introspection such as S4.2.

Remnants of full honesty. From the previous paragraphs it follows that the only epistemic systems which do not suffer from trivial honesty or trivial dishonesty are T and S4. A simple propositional variable $p$ (a fact) is both T- and S4-honest with respect to the general information order, while $\Box p \lor \Box q$ is dishonest in these two systems.

A semantic test for full honesty in such systems as T and S4 is provided by the notion of rootability [9], which is defined on the basis of reflexive amalgamation. A reflexive amalgamation of two states is defined in the same way as a simple amalgamation with the only exception that the 'root world' $w^*$ is taken to be reflexive: $R^* w^* w^*$. A formula $\varphi$ is called rootable if for every pair of $\Box\varphi$-states, a reflexive amalgamation can be found such that $\Box\varphi$ holds in the root world. Rootability implies honesty, but not the other way around. Nevertheless, for every system which is closed under reflexive amalgamation, such as S4 and T, every positive knowledge formula is honest with respect to the general information order if and only if it is rootable. So, for example, $p$ is rootable.

† Another set of Geach axioms satisfying (*) are B and the like ($k, l = 0$, $m > 0$).

4.2 Positive minimality

As we have seen, the systems T and S4 permit non-trivial honest and dishonest formulas. However, there are good reasons to question the feasibility of the notion of full honesty. To begin with, full honesty cannot serve as a good general notion of honesty, since in many systems this notion trivializes, as we have seen in the previous section. Moreover, for epistemic purposes it seems intuitively more sound to exclude formulas which represent ignorance, i.e., formulas of the form $\neg\Box\varphi$, when it comes to minimizing knowledge.



Fig. 4. Two S4-models.

To understand the problem, consider the two models above (in all figures, we omit reflexive arrows). Intuitively, one would say that the agent knows more in state $(M_2, w_2)$ than in $(M_1, w_1)$, since the agent considers fewer possibilities in $(M_2, w_2)$. However, $(M_1, w_1)$ is not smaller than $(M_2, w_2)$ in terms of the general information order. In the first configuration the agent knows that he does not know that $p$, while in the second he does not know that he does not know that $p$, since he knows that $p$. This shows that the general information order on possible worlds does not fit in with our intuition that 'more knowledge' corresponds to 'less uncertainty'.

For the system S5, the restriction to positive minimality turns out to be equivalent to the original analysis of honesty in [6]. In fact a more restricted version of minimality is given in [6], viz. with respect to the language of factual knowledge. However, it can be shown that in the system S5 the disjunction property with respect to this restricted language is equivalent to the disjunction property with respect to the language of positive knowledge formulas.

For some modal systems such as S4, in which neither general nor positive minimality trivializes, it is interesting to compare the two orders. First let us note that for arbitrary normal systems general honesty implies positive honesty (this easily follows from the disjunction properties):

Theorem 13. For any normal system S, if $\varphi$ is S-honest wrt $\sqsubseteq$ then $\varphi$ is also S-honest wrt $\leq$.

However, a similar transfer between different modal systems (and one kind of honesty) is not easily obtained. It may therefore be illuminating to contrast general and positive honesty for S4 with positive honesty for S5. Table 1 displays formulas which are honest (√) or dishonest (−) in the indicated sense.

From Theorem 12, we know that there are no (consistent) formulas that are generally honest in S5. Also, Theorem 13 explains why there are no witnesses for the cases 3 and 4 in the table. Cases 5 and 6 show that Theorem 13 cannot be strengthened to an 'if and only if' statement. Finally, note that there is no relationship between positive honesty in S5 and either general or positive honesty in S4.

Wiebe van der Hoek, Jan Jaspars, and Elias Thijsse

case  formula                                                        S4gen  S4pos  S5pos
1     $p \lor q$                                                      √      √      √
2     $\Box p \lor q$                                                 √      √      −
3     none                                                            √      −      √
4     none                                                            √      −      −
5     $\Box p \lor \Box\Diamond q$                                    −      √      √
6     $\Box p \lor \Box\Diamond\Box q$                                −      √      −
7     $(\Box(\Box p \lor q) \land \neg\Box q) \lor \Box(p \lor r)$    −      −      √
8     $\Box p \lor \Box q$                                            −      −      −

Table 1. Several (dis-)honest formulas for S4 and S5

For illustrative purposes, let us prove both a positive and a negative entry in Table 1. To start with, let us consider $\varphi = \Box p \lor \Box\Diamond q$. In order to demonstrate that $\Box\varphi$ has the S4-DP with respect to $\Box\mathcal{L}^+$, let $\alpha, \beta \in \mathcal{L}^+$. To prove that $\Box\varphi \vdash_{S4} \Box\alpha \lor \Box\beta$ implies that either $\Box\varphi \vdash_{S4} \Box\alpha$, or $\Box\varphi \vdash_{S4} \Box\beta$, we argue by contraposition. Hence assume that $\Box\varphi \nvdash_{S4} \Box\alpha$, and $\Box\varphi \nvdash_{S4} \Box\beta$. Using completeness, we find two S4-models $M = (W, R, V)$ and $M' = (W', R', V')$, with

$M, w \models \Box\varphi \land \neg\Box\alpha$, and $M', w' \models \Box\varphi \land \neg\Box\beta$

Thus, there must be $v$ and $v'$ such that $M, v \models \varphi \land \neg\alpha$, and $M', v' \models \varphi \land \neg\beta$ (see Figure 5). Now, we build a new model $M^*$ out of $M$ and $M'$ as follows.

To the reflexive amalgamation of the two models, we add a 'common ceiling' $u$ in which $q$ is true, i.e. let

$W^* = W \cup W' \cup \{w^*, u\}$,

$R^* = R \cup R' \cup (W^* \times \{u\}) \cup (\{w^*\} \times W^*)$,

and $V^*$ equals $V$ on $W$, $V'$ on $W'$, $V^*(q, u) = 1$ and $V^*$ on $w^*$ is arbitrary. $M^*$ is an S4-model, since it is both reflexive and transitive. Because $M^*, u \models q$, we have that $M^*, w^* \models \Box\varphi$. Finally $M^*, w^* \not\models \Box\alpha \lor \Box\beta$, for suppose that $M^*, w^* \models \Box\alpha$, then in particular $M^*, v \models \alpha$. Since $M$ is carefully constructed to be a submodel of $M^*$, by the corollary of the Łoś theorem: $M, v \models \alpha$, contradicting the assumption. Thus $M^*, w^* \not\models \Box\alpha$. By a similar argument, using the fact that $M'$ is also a submodel of $M^*$, $M^*, w^* \not\models \Box\beta$. In all, $M^*, w^* \not\models \Box\alpha \lor \Box\beta$, hence by soundness $\Box\varphi \nvdash_{S4} \Box\alpha \lor \Box\beta$. Since $\Box\varphi$ is obviously S4-consistent, this completes the proof that $\varphi$ is S4-honest wrt the positive information order.

To illustrate the derivation of a negative result in Table 1, let us consider the first '−' at line 7. Let $\varphi = (\Box(\Box p \lor q) \land \neg\Box q) \lor \Box(p \lor r)$. We will disprove the S4-DP for $\varphi$. First of all, it is easily verified that $\Box\varphi \vdash_{S4} \Box(\Box p \lor q) \lor \Box(p \lor r)$. However, the two models $M$ and $M'$ of Figure 6 illustrate that $\Box\varphi \nvdash_{S4} \Box(\Box p \lor q)$ and that $\Box\varphi \nvdash_{S4} \Box(p \lor r)$, respectively.

Fig. 5. Adding a common ceiling and root to two S4-models

Fig. 6. Two counter models



5 Conclusion 

Until now, research on the notion of minimal information has mainly been devoted to particular modal logics such as S4 and S5. Most often, authors also used non-standard semantics and specialized techniques to model minimal information.
In this paper we offered a general approach to the representation of minimal 
information on the basis of standard Kripke models for arbitrary normal modal 
logics. 

The key idea is the use of preservation results for modal logic, and can be 
sketched as follows. A formula expresses minimal information iff there is a least 
verifying model; this presupposes an order wrt which the model is minimal. Next 
determine which formulas are preserved by this order, i.e. which formulas remain 
true when moving to a ‘greater’ (more informative) model. If this sublanguage 
also characterizes the order, this is a suitable information order. Such an order and the sublanguage it preserves then provide precise syntactic, deductive and semantic criteria for minimality which can be used as equivalent tests for representation of minimal information by single, so-called honest, formulas.

In the context of epistemic logic this analysis led to the choice of the positive Ehrenfeucht-Fraïssé order on models for the semantic analysis of minimality and the sublanguage of positive knowledge formulas for the syntactic and the deductive description of minimality. The conclusion is that this description offers an adequate analysis of minimality for epistemic systems.

One may argue that for this analysis trivial honesty (i.e. all consistent formulas are honest) still occurs on the level of weak modal logics such as K, KD, K4 and KD4, and that therefore, our choice is not adequate for those systems. In epistemic logic, however, these systems are simply too weak, since in these systems a set such as, for example, $\{\Box(\Box p \lor \Box\neg p), \neg\Box p, \neg\Box\neg p\}$ is consistent, which seems unacceptable when $\Box$ represents some epistemic attitude (an argument against Hintikka's KD4 axiomatization of belief [7]). It is implausible that an agent may have the information that he has the information whether $p$, while on the other hand, he doubts whether $p$ is the case.

Needless to say, the generality of our approach may inspire further research in this area. One obvious topic is the application of our results to particular systems, such as extensions of S4 like S4.2 and S4F. Other lines of future research may be the adaptation of our techniques to multiple agent systems and the move to partial semantics.

Abstract. In this paper we introduce Prohairetic Deontic Logic (PDL), a preference-based dyadic deontic logic. An obligation '$\alpha$ should be (done) if $\beta$ is (done)' is true if (1) no $\neg\alpha \land \beta$ state is as preferable as an $\alpha \land \beta$ state and (2) the preferred $\beta$ states are $\alpha$ states. We show that the different elements of this mixed representation solve different problems of deontic logic. The first part of the definition is used to formalize contrary-to-duty reasoning, which occurs for example in Chisholm's and Forrester's notorious deontic paradoxes. The second part is used to make dilemmas inconsistent. PDL shares the intuitive semantics of preference-based deontic logics without introducing additional semantic machinery such as bi-ordering semantics or ceteris paribus preferences.



1 Introduction 

Deontic logic is a modal logic, in which absolute and conditional obligations are represented by the modal formulas $O\alpha$ and $O(\alpha|\beta)$, where the latter is read as '$\alpha$ ought to be (done) if $\beta$ is (done).' It can be used for the formal specification
and validation of a wide variety of topics in computer science (for an overview 
and further references see [44]). For example, deontic logic can be used to for- 
mally specify soft constraints in planning and scheduling problems as norms. 
The advantage is that norms can be violated without creating an inconsistency 
in the formal specification, in contrast to violations of hard constraints. With 
the increasing popularity and sophistication of applications of deontic logic the 
fundamental problems of deontic logic become more pressing. 

From the early days, when deontic logic was still a purely philosophical enterprise, it has been known that it suffers from certain paradoxes. The majority of the paradoxes evaporate under close scrutiny, yet some of them - most notably the contrary-to-duty paradoxes - persist. The conceptual issue of these paradoxes
is how to proceed once a norm has been violated. Clearly, this issue is of great 
practical relevance, because in most applications norms are violated frequently. 
Usually it is stipulated in the fine print of a contract what has to be done if a 
term in the contract is violated. If the violation is not too serious, or was not intended by the violating party, the contracting parties usually do not want to consider this as a breach of contract, but simply as a disruption in the execution of the contract that has to be repaired. Hence, the contrary-to-duty paradoxes are important benchmark examples of deontic logic, and deontic logics incapable of dealing with them are considered insufficient tools to analyze deontic reasoning. In this paper we restrict ourselves to the area of preference-based deontic logic [12,23,19,10,15,17,4], i.e. deontic logics based on a preference relation in the semantics, because they have proven to be the most appropriate to solve the paradoxes. Still, preference-based deontic logic has the following three problems.

Strong preference problem. Preferences for $\alpha_1$ and $\alpha_2$ conflict for $\alpha_1 \land \neg\alpha_2$ and $\neg\alpha_1 \land \alpha_2$.

Contrary-to-duty problem. A contrary-to-duty obligation is an obligation 
that is only in force in a sub-ideal situation. For example, the obligation 
to apologize for a broken promise is only in force in the sub-ideal situation 
where the obligation to keep promises is violated. Reasoning structures like 
'$\alpha_1$ should be (done), but if $\neg\alpha_1$ is (done) then $\alpha_2$ should be (done)' must be
formalized without running into the notorious contrary-to-duty paradoxes of 
deontic logic like Chisholm’s and Forrester’s paradoxes [6,9]. 

Dilemma problem. Most deontic logicians take the perspective that deon- 
tic logic formalizes the reasoning of an authority issuing norms, and such 
an authority does not intentionally create dilemmas. Consequently, dilem- 
mas should be inconsistent. However, some deontic logicians, most notably 
in computer science, try to model obligations in practical reasoning, and 
in daily life dilemmas exist (see e.g. [40,4]). Consequently, according to 
this alternative perspective dilemmas should be consistent. In this paper 
we follow the traditional and mainstream perspective. The three formulas $O\alpha \land O\neg\alpha$, $O(\alpha_1 \land \alpha_2) \land O\neg\alpha_1$ and $O(\alpha_1 \land \alpha_2|\beta_1) \land O(\neg\alpha_1|\beta_1 \land \beta_2)$ represent dilemmas and they should therefore be inconsistent. However, the formula $O(\alpha|\beta_1) \land O(\neg\alpha|\beta_2)$ does not represent a dilemma and should be consistent.

In this paper we propose Prohairetic (i.e. Preference-based) Deontic Logic (PDL), 
a logic of non-defeasible obligations in which dilemmas are inconsistent. The basic idea of this logic is that an obligation '$\alpha$ should be (done) if $\beta$ is (done)' is true if (1) no $\neg\alpha \land \beta$ state is as preferable as an $\alpha \land \beta$ state and (2) the preferred $\beta$ states are $\alpha$ states. The first part of the definition is used to formalize contrary-to-duty reasoning, and the second part is used to make dilemmas inconsistent. PDL shares the intuitive semantics of preference-based deontic logics,
and solves the strong preference problem without introducing additional seman- 
tic machinery like bi-ordering semantics or ceteris paribus preferences. Moreover, 
PDL shares the intuitive formalization of contrary-to-duty reasoning of dyadic 
deontic logic [12,23]. Finally, PDL solves the dilemma problem by making the 
right set of formulas inconsistent. 

This paper is organized as follows. We first discuss the strong preference 
problem (Section 2) and the contrary-to-duty and dilemma problems (Section 3). 
We then give an axiomatization of PDL in a modal preference logic (Section 4), 
and finally we reconsider the three problems in PDL (Section 5).

2 Obligations and preferences

It has been suggested [20,19,10,15,17,4] that a unary deontic operator $O$ might be defined in a logic of preference by $O\alpha =_{def} \alpha \succ \neg\alpha$. In the following discussion we assume possible worlds models with a deontic preference ordering on the worlds, i.e. Kripke models $(W, \leq, V)$ which consist of a set of worlds $W$, a binary reflexive and transitive accessibility relation $\leq$ and a valuation function $V$. The problem is how to lift the preferences between worlds to preferences between sets of worlds or propositions, written as $\alpha_1 \succ \alpha_2$. It is well-known from the preference logic literature [41] that the preference $\alpha \succ \neg\alpha$ cannot be defined by the set of preferences of all $\alpha$ worlds to each $\neg\alpha$ world, because two unrelated obligations like 'be polite' $Op$ and 'be helpful' $Oh$ would conflict when considering 'being polite and unhelpful' $p \land \neg h$ and 'being impolite and helpful' $\neg p \land h$. Proof-theoretically, if a preference relation $\succ$ has left and right strengthening, then the two preferences $p \succ \neg p$ and $h \succ \neg h$ derive $(p \land \neg h) \succ (\neg p \land h)$ and $(\neg p \land h) \succ (p \land \neg h)$. The two derived preferences seem contradictory.
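
The conflict can be confirmed by brute force on the four worlds over $p$ and $h$. The sketch below is our own illustration: it reads 'preferred' as strict preference in a reflexive and transitive relation, and the names `preorders` and `all_to_each` as well as the pair convention are assumptions.

```python
from itertools import product

# Worlds are truth assignments (p, h): polite?, helpful?
WORLDS = [(p, h) for p in (True, False) for h in (True, False)]
PAIRS = [(a, b) for a in WORLDS for b in WORLDS if a != b]

def preorders():
    """Yield every reflexive, transitive relation on WORLDS.
    (a, b) in geq reads 'a is at least as preferable as b' (our convention)."""
    for bits in product((False, True), repeat=len(PAIRS)):
        geq = {(w, w) for w in WORLDS}
        geq |= {pair for pair, bit in zip(PAIRS, bits) if bit}
        if all((a, c) in geq for (a, b) in geq for (b2, c) in geq if b == b2):
            yield geq

def all_to_each(geq, alpha):
    """Naive lifting: every alpha world strictly beats every non-alpha world."""
    return all((a, b) in geq and (b, a) not in geq
               for a in WORLDS if alpha(a)
               for b in WORLDS if not alpha(b))

polite = lambda w: w[0]
helpful = lambda w: w[1]

models = list(preorders())
only_polite = [g for g in models if all_to_each(g, polite)]
both = [g for g in models if all_to_each(g, polite) and all_to_each(g, helpful)]
print(len(only_polite) > 0, len(both))
```

Under these assumptions each obligation is satisfiable on its own, but no ordering satisfies both, because the polite-but-unhelpful world and the impolite-but-helpful world would each have to be strictly better than the other.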

The conflict can be resolved with additional information. For example, politeness may be less important than helpfulness, such that $(p \land \neg h) \succ (\neg p \land h)$ is less important than $(\neg p \land h) \succ (p \land \neg h)$. This relative importance of obligations can only be formalized in a defeasible deontic logic, in which obligations can be overridden by other obligations (for an analysis of the conceptual distinctions, see [33,35]), because $(p \land \neg h) \succ (\neg p \land h)$ is overridden by $(\neg p \land h) \succ (p \land \neg h)$, and $Op$ is not in force when only $(p \land \neg h) \lor (\neg p \land h)$ worlds are considered. However,
in this paper we only consider non-defeasible or non-overridable obligations. For 
such logics, the following three solutions have been considered. 

Bi-ordering. Jackson [19] and Goble [10] introduce a second ordering representing degrees of 'closeness' of worlds to solve the strong preference problem. They define the preference $\alpha \succ \neg\alpha$ by the set of preferences of the closest $\alpha$ worlds to the closest $\neg\alpha$ worlds. The underlying idea is that in certain contexts the way things are in some worlds can be ignored - perhaps they are too remote from the actual world, or outside an agent's control. For example, the obligations $Op$ and $Oh$ are consistent when 'polite and unhelpful' $p \land \neg h$ and 'impolite and helpful' $\neg p \land h$ are not among the closest $p$, $\neg p$, $h$ and $\neg h$ worlds.¹



¹ This solution of the strong preference problem introduces an irrelevance problem, because the preferences no longer have left and right strengthening. For example, the preference $(p \land h) \succ (\neg p \land h)$ cannot even be derived from 'be polite' $p \succ \neg p$, because $p \land h$ or $\neg p \land h$ may not be among the closest $p$ or $\neg p$ worlds. There is another interpretation of the closeness ordering. 'The closest' could also be interpreted as 'the most normal' as used in the preferential semantics of logics of defeasible reasoning. The 'multi preference' semantics is a formalization of defeasible deontic logic [33,35], sometimes called the logic of prima facie obligations. However, it is not clear that closeness is an intuitive concept for non-defeasible obligations.

Ceteris paribus. Hansson [15,14,13] defines $\alpha \succ \neg\alpha$ by a 'ceteris paribus' preference of $\alpha$ to $\neg\alpha$, see [4] for a discussion.² That is, for each pair of $\alpha$ and $\neg\alpha$ worlds that are identical except for the evaluation of $\alpha$, the $\alpha$ world is preferred to the $\neg\alpha$ world. The obligation 'be polite' $Op$ prefers 'polite and helpful' $p \land h$ to 'impolite and helpful' $\neg p \land h$, and 'polite and unhelpful' $p \land \neg h$ to 'impolite and unhelpful' $\neg p \land \neg h$, but it does not say anything about $p \land h$ and $\neg p \land \neg h$, and neither about $p \land \neg h$ and $\neg p \land h$. Likewise, the obligation 'be helpful' $Oh$ prefers 'polite and helpful' $p \land h$ to 'polite and unhelpful' $p \land \neg h$, and 'impolite and helpful' $\neg p \land h$ to 'impolite and unhelpful' $\neg p \land \neg h$. These preferences can be combined in a single preference ordering for $Op \land Oh$ that prefers $p \land h$ worlds to all other worlds, and that prefers all worlds to $\neg p \land \neg h$ worlds.³

Consistent dilemmas. Finally, a preference $\alpha \succ \neg\alpha$ can be defined by 'every $\neg\alpha$ world is not as preferable as any $\alpha$ world' (or, maybe more intuitively, $\alpha \not\succ \neg\alpha$ is defined as 'there is a $\neg\alpha$ world $w_1$ which is at least as preferable as an $\alpha$ world $w_2$'). The definition is equivalent to the problematic 'all $\alpha$ worlds are preferred to each $\neg\alpha$ world' if the underlying preference ordering on worlds $\leq$ is strongly connected, i.e. if for each pair of worlds $w_1$ and $w_2$ in a model $M$ we have either $w_1 \leq w_2$ or $w_2 \leq w_1$. However, the two obligations $Op$ and $Oh$ do not conflict when considering $p \land \neg h$ and $\neg p \land h$ when we allow for incomparable worlds, following [40]. In contrast to the other solutions of the strong preference problem, dilemmas like $Op \land O\neg p$ are consistent. We say that the logic does not have the no-dilemma assumption. The preference relation has left and right strengthening, and $p \succ \neg p$ and $h \succ \neg h$ imply $(p \land \neg h) \succ (\neg p \land h)$ and $(\neg p \land h) \succ (p \land \neg h)$. However, the latter two preferences are not logically inconsistent. The $\neg p \land h$ and $p \land \neg h$ worlds are incomparable.⁴

² Moreover, Hansson defines obligations by the property of negativity. According to this principle, what is worse than something wrong is itself wrong. See [15,4] for a discussion on this assumption.

³ In [16] Hansson rejects the use of ceteris paribus preferences for obligations (in contrast to, for example, desires). Moreover, ceteris paribus preferences introduce an independence problem. At first sight, it seems that a 'ceteris paribus' preference $\alpha \succ \neg\alpha$ is a set of preferences of all $\alpha \land \beta$ worlds to each $\neg\alpha \land \beta$ world for all circumstances $\beta$ such that $\alpha \land \beta$ and $\neg\alpha \land \beta$ are complete descriptions (represented by worlds). However, consider the preference $p \succ \neg p$ and circumstances $p \leftrightarrow \neg h$. The preference $p \succ \neg p$ would derive $(p \land (p \leftrightarrow \neg h)) \succ (\neg p \land (p \leftrightarrow \neg h))$, which is logically equivalent to the problematic $(p \land \neg h) \succ (\neg p \land h)$. The exclusion of circumstances like $p \leftrightarrow \neg h$ is the independence problem. Only for 'independent' $\beta$ there is a preference of $\alpha \land \beta$ over $\neg\alpha \land \beta$ (see e.g. [29] for an ad hoc solution of the problem).

⁴ It was already argued by von Wright [41] that this latter property is highly implausible for preferences. On the other hand, this solution is simpler than the first two solutions of the strong preference problem, because it does not use additional semantic machinery such as the second ordering or the ceteris paribus preferences. Moreover, it does not have an irrelevance or an independence problem.

Prohairetic Deontic Logic (PDL) proposed in this paper is an extension of the third approach. To formalize the no-dilemma assumption, we write $M \models I\alpha$ for 'the ideal worlds in $M$ satisfy $\alpha$.' Hence, if we ignore infinite descending chains⁵ then we can define that $M \models I\alpha$ if and only if $Pref \subseteq |\alpha|$, where $Pref$ stands for the set of most preferred (ideal) worlds of $M$, and $|\alpha|$ stands for the set of all worlds satisfying $\alpha$. Obligations are defined as a combination of a strong preference and an ideal preference.

$O\alpha =_{def} (\alpha \succ \neg\alpha) \land I\alpha$

The formula $Op \land O\neg p$ is inconsistent, because the formula $Ip \land I\neg p$ is inconsistent. The two obligations 'be polite' $Op$ and 'be helpful' $Oh$ are formalized by (1) $p$ worlds are preferred to or incomparable with $\neg p$ worlds, (2) $h$ worlds are preferred to or incomparable with $\neg h$ worlds, and (3) the ideal worlds are $p \land h$ worlds.

In the following section we argue that this solution is not only simpler than 
the first two solutions of the strong preference problem, but it also gives a more 
intuitive solution to the contrary-to-duty problem. 
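
The monadic definition can be evaluated on a small finite model. The sketch below is our own illustration and not the axiomatization of Section 4: the model, the pair convention for the ordering, and the names `strong_pref`, `ideal` and `obligatory` are assumptions. The polite-and-helpful world is best, the impolite-and-unhelpful world is worst, and the two mixed worlds are incomparable.

```python
WORLDS = [(True, True), (True, False), (False, True), (False, False)]  # (p, h)
BEST, WORST = (True, True), (False, False)

# (a, b) in GEQ reads 'a is at least as preferable as b' (our convention).
GEQ = ({(w, w) for w in WORLDS}
       | {(BEST, w) for w in WORLDS}
       | {(w, WORST) for w in WORLDS})

def strong_pref(alpha):
    """alpha > not-alpha: no non-alpha world is at least as preferable
    as an alpha world."""
    return not any((b, a) in GEQ
                   for a in WORLDS if alpha(a)
                   for b in WORLDS if not alpha(b))

def ideal(alpha):
    """I alpha: every most preferred world satisfies alpha."""
    pref = [w for w in WORLDS
            if not any((v, w) in GEQ and (w, v) not in GEQ for v in WORLDS)]
    return all(alpha(w) for w in pref)

def obligatory(alpha):
    """O alpha, defined as the conjunction of the two parts."""
    return strong_pref(alpha) and ideal(alpha)

p = lambda w: w[0]
h = lambda w: w[1]
not_p = lambda w: not w[0]

print(obligatory(p), obligatory(h), obligatory(not_p))
```

In this model both 'be polite' and 'be helpful' hold while the opposite obligation fails; because the two mixed worlds are incomparable, neither strong preference is violated.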



3 Dyadic obligations and contrary-to-duty preferences 

The contrary-to-duty problem is the major problem of monadic deontic logic, as shown by the notorious Good Samaritan [2], Chisholm [6] and Forrester [9] paradoxes. The formalization of these paradoxes should be consistent. For example, the formalization of the Forrester paradox in monadic deontic logic is 'Smith should not kill Jones' ($O\neg k$), 'if Smith kills Jones, then he should do it gently' ($k \rightarrow Og$) and 'Smith kills Jones' ($k$). From the three formulas $O\neg k \land Og$ can be derived. The derived formula should be consistent, even if we have 'gentle killing implies killing,' i.e. $\vdash g \rightarrow k$, see e.g. [11]. However, this formalization of the Forrester paradox does not do justice to the fact that only in very few cases we seem to have that $O\neg\alpha \land O(\alpha \land \beta)$ is not a dilemma, and should be consistent. The consistency of $O\neg k \land Og$ is a solution that seems like overkill. Deontic logicians therefore tried to formalize contrary-to-duty reasoning by introducing temporal and preferential notions [37].

B. Hansson [12] and Lewis [23] argued that the contrary-to-duty problem can be solved by introducing dyadic obligations. A dyadic obligation $O(\alpha|\beta)$ is read as '$\alpha$ ought to be (done) if $\beta$ is (done).' They define a dyadic obligation by $O_{HL}(\alpha|\beta) =_{def} I(\alpha|\beta)$, where we write $I(\alpha|\beta)$ for 'the ideal $\beta$ worlds satisfy $\alpha$.' Hence, if we again ignore infinite descending chains, then we define $M \models I(\alpha|\beta)$ if and only if $Pref(\beta) \subseteq |\alpha|$, where $Pref(\beta)$ stands for the preferred $\beta$ worlds of $M$. The introduction of the dyadic representation was inspired by the standard way of representing conditional probability, that is, by $Pr(\alpha|\beta)$ which stands for 'the probability that $\alpha$ is the case given $\beta$.' In a dyadic deontic logic the Forrester paradox can be formalized by 'Smith should not kill Jones' $O(\neg k|\top)$, 'if Smith kills Jones, then he should do it gently' $O(g|k)$ and 'Smith kills Jones' $k$. In this formalization $\top$ stands for any tautology, e.g. $p \lor \neg p$. The obligation $O(g|k)$ is a contrary-to-duty (CTD) obligation of $O(\neg k|\top)$, because an obligation $O(\alpha|\beta)$ is a CTD obligation of the primary obligation $O(\alpha_1|\beta_1)$ if and only if $\alpha_1 \land \beta$ is inconsistent. In dyadic deontic logic, the formula $O(\neg k|\top) \land O(g|k)$ is consistent, whereas the formula $O(\neg k|\top) \land O(g|\top)$ is inconsistent when we have $\vdash g \rightarrow k$.

⁵ The problems caused by infinite descending chains are illustrated by the following example. Assume a model that consists of one infinite descending chain of $\neg\alpha$ worlds. It seems obvious that the model should not satisfy $I\alpha$. However, the most preferred worlds (which do not exist!) satisfy $\alpha$. See [22,3] for a discussion.

The Hansson-Lewis dyadic deontic logics have been criticized (see e.g. [24]), because they do not have factual detachment, represented by the formula FD: $O(\alpha|\beta) \land \beta \rightarrow O\alpha$, i.e. the derivation of absolute obligations from dyadic ones. However, there are good reasons not to accept FD. If the logics had FD, then they would reinstate the Forrester paradox, because we would derive $O\neg k \land Og$ from $O(\neg k|\top) \land O(g|k) \land k$. To explicate the difference with dyadic obligations which do have factual detachment and therefore cannot represent the Forrester paradox, we prefer to call the Hansson-Lewis obligations contextual obligations, see also [34]. Instead of FD we may have $O(\alpha|\beta \land O(\alpha|\beta))$ as a theorem, see [43].

Dyadic obligations formalize contrary-to-duty reasoning, without making dilemmas like $O\alpha_1 \land O(\neg\alpha_1 \land \alpha_2)$ consistent. However, the dyadic representation also introduces a new instance of the dilemma problem, represented by the formula $O(\alpha|\beta_1) \land O(\neg\alpha|\beta_1 \land \beta_2)$. An example is Prakken and Sergot's considerate assassin example, that consists of the two obligations 'Smith should not offer Jones a cigarette' $O(\neg c|\top)$ and 'Smith should offer Jones a cigarette, if he kills him' $O(c|k)$. Prakken and Sergot [26] argue that the two sentences of the considerate assassin example represent a dilemma, because the obligation $O(c|k)$ is not a CTD obligation of $O(\neg c|\top)$. Hence, $O(\neg c|\top) \land O(c|k)$ should be inconsistent, even when there is another premise 'Smith should not kill Jones' $O(\neg k|\top)$.

B. Hansson-Lewis dyadic deontic logics do not give a satisfactory solution for
the dilemma problem, because $O_{HL}(\neg c \mid \top) \land O_{HL}(c \mid k)$ is consistent. In Prohairetic
Deontic Logic, dyadic obligations are defined in a similar spirit as the absolute
obligations in the previous section.

$O(\alpha \mid \beta) =_{def} ((\alpha \land \beta) \succ (\neg\alpha \land \beta)) \land I(\alpha \mid \beta)$

The set of obligations $S = \{O(\neg c \mid \top), O(c \mid k)\}$ is inconsistent, because the
formula $(\neg c \succ c) \land I(c \mid k)$ is inconsistent, as is shown in Section 5.

4 Axiomatization 

Prohairetic Deontic Logic (PDL) is defined in a modal preference logic. The
standard Kripke models $M = \langle W, \leq, V \rangle$ of PDL contain a binary accessibility
relation $\leq$, that is interpreted as a (reflexive and transitive) deontic preference
ordering. The advantages of our formalization in a modal framework are twofold.
First, if a dyadic operator is given by a definition in an underlying logic, then
we get an axiomatization for free! We do not have to look for a sound and







complete set of inference rules and axiom schemata, because we simply take
the axiomatization of the underlying logic together with the new definition. In
other words, the problem of finding a sound and complete axiomatization is
replaced by the problem of finding a definition of a dyadic obligation in terms of
a monadic modal preference logic. The second advantage of a modal framework
in which all operators are defined, is that $I(\alpha \mid \beta)$ and $\alpha_1 \succ \alpha_2$ can be defined
separately. In this section we axiomatize PDL in the following three steps in
terms of a monadic modal logic and a deontic betterness relation. See [10,14]
for an analogous stepwise construction of ‘good’ in terms of ‘better’ and [43] for
a stepwise construction of minimizing conditionals analogous to $O_{HL}(\alpha \mid \beta)$ in
terms of a ‘betterness’ relation.

Ideality (deontic preference) ordering. We start with two monadic modal
operators $\Box$ and $\overleftrightarrow{\Box}$. The formula $\Box\alpha$ can be read as ‘$\alpha$ is true in all worlds
at least as good (as the actual world)’ or ‘$\neg\alpha$ is necessarily worse,’ and $\overleftrightarrow{\Box}\alpha$
can be read as ‘$\alpha$ is true in all worlds.’

$M, w \models \Box\alpha$ iff $\forall w' \in W$: if $w' \leq w$, then $M, w' \models \alpha$

$M, w \models \overleftrightarrow{\Box}\alpha$ iff $\forall w' \in W$: $M, w' \models \alpha$

The $\Box$ operator will be treated as an S4 modality and the $\overleftrightarrow{\Box}$ operator as
an S5 modality. As is well-known, the standard system S4 is characterized
by a partial pre-ordering: the axiom T: $\Box\alpha \to \alpha$ characterizes reflexivity
and the axiom 4: $\Box\alpha \to \Box\Box\alpha$ characterizes transitivity [18,5]. Moreover, the
standard system S5 is characterized by S4 plus the axiom 5: $\neg\overleftrightarrow{\Box}\alpha \to \overleftrightarrow{\Box}\neg\overleftrightarrow{\Box}\alpha$.
The relation between the modal operators is given by $\overleftrightarrow{\Box}\alpha \to \Box\alpha$. This
is analogous to the well known relation between the modal operators for
knowledge $K$ and belief $B$ given by $Kp \to Bp$.

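Deontic betterness relation. A binary betterness relation $\alpha_1 \succ \alpha_2$, to be
read as ‘$\alpha_1$ is deontically preferred to (better than) $\alpha_2$,’ is defined in terms of
the monadic operators. The following betterness relation obeys von Wright’s
expansion principle [41], because a preference of $\alpha_1$ over $\alpha_2$ only compares
the two formulas $\alpha_1 \land \neg\alpha_2$ and $\neg\alpha_1 \land \alpha_2$.

$\alpha_1 \succ \alpha_2 =_{def} \overleftrightarrow{\Box}((\alpha_1 \land \neg\alpha_2) \to \Box\neg(\alpha_2 \land \neg\alpha_1))$

We have $M, w \models \alpha_1 \succ \alpha_2$ if we have $w_2 \not\leq w_1$ for all worlds $w_1, w_2 \in W$ such
that $M, w_1 \models \alpha_1 \land \neg\alpha_2$ and $M, w_2 \models \alpha_2 \land \neg\alpha_1$, where we write as usual
$\Diamond\alpha =_{def} \neg\Box\neg\alpha$.
The betterness relation $\succ$ is quite weak. For example, it is not anti-symmetric
(i.e. $\neg(\alpha_2 \succ \alpha_1)$ cannot be derived from $\alpha_1 \succ \alpha_2$) and it is not transitive
(i.e. $\alpha_1 \succ \alpha_3$ cannot be derived from $\alpha_1 \succ \alpha_2$ and $\alpha_2 \succ \alpha_3$). It is easily
checked that the lack of these properties is the result of the fact that we do
not have totally connected orderings.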
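Obligatory. What is obligatory is defined in terms of deontic betterness.

$I(\alpha \mid \beta) =_{def} \overleftrightarrow{\Box}(\beta \to \Diamond(\beta \land \Box(\beta \to \alpha)))$

$O(\alpha \mid \beta) =_{def} ((\alpha \land \beta) \succ (\neg\alpha \land \beta)) \land I(\alpha \mid \beta)$

We have $M, w \models I(\alpha \mid \beta)$ if the preferred $\beta$ worlds are $\alpha$ worlds, and $\alpha$
eventually becomes true in all infinite descending chains of $\beta$ worlds [21,3].
Finally, we have $M, w \models O(\alpha \mid \beta)$ if we have $w_2 \not\leq w_1$ for all $w_1, w_2 \in W$ such
that $M, w_1 \models \alpha \land \beta$ and $M, w_2 \models \neg\alpha \land \beta$, and $M, w \models I(\alpha \mid \beta)$.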

The logic PDL is defined by defining these three layers in a modal preference 
logic. 
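
The first two layers can be tried out on a small finite model. The following is only a sketch under assumptions that are not part of the paper: worlds are plain strings, the ordering is a Python set `leq` of pairs `(u, v)` read as "u is at least as good as v", formulas are predicates on worlds, and the names `box`, `box_all` and `better` are hypothetical. The three-world model at the end shows that the betterness relation is neither anti-symmetric nor transitive when worlds are incomparable.

```python
def box(leq, w, phi):
    """S4 box at world w: phi holds in every world at least as good as w."""
    return all(phi(u) for (u, v) in leq if v == w)

def box_all(worlds, phi):
    """S5 (universal) box: phi holds in every world of the model."""
    return all(phi(u) for u in worlds)

def better(worlds, leq, phi1, phi2):
    """phi1 > phi2, read off the definition:
    box_all((phi1 and not phi2) -> box(not (phi2 and not phi1)))."""
    return box_all(
        worlds,
        lambda w: not (phi1(w) and not phi2(w))
        or box(leq, w, lambda u: not (phi2(u) and not phi1(u))),
    )

# Illustrative model (an assumption): three worlds, each making exactly one atom true.
worlds = ["wa", "wb", "wc"]
# Reflexive pairs plus "wc is at least as good as wa"; this relation is transitive.
leq = {(w, w) for w in worlds} | {("wc", "wa")}
a = lambda w: w == "wa"
b = lambda w: w == "wb"
c = lambda w: w == "wc"

print(better(worlds, leq, a, b))  # a > b holds, because wa and wb are incomparable
print(better(worlds, leq, b, a))  # b > a holds as well: no anti-symmetry
print(better(worlds, leq, b, c))  # b > c holds
print(better(worlds, leq, a, c))  # a > c fails, because wc is at least as good as wa
```

Because `better` is written directly in terms of `box_all` and `box`, it mirrors the way the dyadic notions are obtained by definition from the underlying bimodal logic.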

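Definition 1 (PDL). The bimodal language $\mathcal{L}$ is formed from a denumerable
set of propositional variables together with the connectives $\neg$, $\to$ and the two
normal modal connectives $\Box$ and $\overleftrightarrow{\Box}$. Dual ‘possibility’ connectives $\Diamond$ and $\overleftrightarrow{\Diamond}$ are
defined as usual by $\Diamond\alpha =_{def} \neg\Box\neg\alpha$ and $\overleftrightarrow{\Diamond}\alpha =_{def} \neg\overleftrightarrow{\Box}\neg\alpha$.

The logic PDL is the smallest $S \subseteq \mathcal{L}$ such that $S$ contains classical logic and the
following axiom schemata, and is closed under the following rules of inference,

K   $\Box(\alpha \to \beta) \to (\Box\alpha \to \Box\beta)$
K'  $\overleftrightarrow{\Box}(\alpha \to \beta) \to (\overleftrightarrow{\Box}\alpha \to \overleftrightarrow{\Box}\beta)$
T   $\Box\alpha \to \alpha$
T'  $\overleftrightarrow{\Box}\alpha \to \alpha$
4   $\Box\alpha \to \Box\Box\alpha$
4'  $\overleftrightarrow{\Box}\alpha \to \overleftrightarrow{\Box}\overleftrightarrow{\Box}\alpha$
5'  $\neg\overleftrightarrow{\Box}\alpha \to \overleftrightarrow{\Box}\neg\overleftrightarrow{\Box}\alpha$
    $\overleftrightarrow{\Box}\alpha \to \Box\alpha$ (the relation between the two operators)

Nes From $\alpha$ infer $\overleftrightarrow{\Box}\alpha$
MP  From $\alpha \to \beta$ and $\alpha$ infer $\beta$

extended with the following three definitions.

$\alpha_1 \succ \alpha_2 =_{def} \overleftrightarrow{\Box}((\alpha_1 \land \neg\alpha_2) \to \Box\neg(\alpha_2 \land \neg\alpha_1))$

$I(\alpha \mid \beta) =_{def} \overleftrightarrow{\Box}(\beta \to \Diamond(\beta \land \Box(\beta \to \alpha)))$

$O(\alpha \mid \beta) =_{def} ((\alpha \land \beta) \succ (\neg\alpha \land \beta)) \land I(\alpha \mid \beta)$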

Definition 2 (PDL Semantics). Kripke models $M = \langle W, \leq, V \rangle$ for PDL consist
of $W$, a set of worlds, $\leq$, a binary transitive and reflexive accessibility relation,
and $V$, a valuation of the propositional atoms in the worlds. The partial
pre-ordering $\leq$ expresses preferences: $w_1 \leq w_2$ iff $w_1$ is at least as preferable as
$w_2$. The modal connective $\Box$ refers to accessible worlds and the modal connective
$\overleftrightarrow{\Box}$ to all worlds.

$M, w \models \Box\alpha$ iff $\forall w' \in W$: if $w' \leq w$, then $M, w' \models \alpha$
$M, w \models \overleftrightarrow{\Box}\alpha$ iff $\forall w' \in W$: $M, w' \models \alpha$

The following proposition shows that, as a consequence of the definition in a 
standard bimodal logic, the soundness and completeness of PDL are trivial. 
Proposition 3 (Soundness and completeness of PDL). Let $\vdash_{PDL}$ and $\models_{PDL}$
stand for derivability and logical entailment in the logic PDL. We have $\vdash_{PDL} \alpha$
if and only if $\models_{PDL} \alpha$.

Proof. Follows directly from standard modal soundness and completeness proofs, 
see e.g. [18,5,28]. 

We now consider several properties of the dyadic obligations. First, the logic
PDL has the following theorem, which is valid for any preference-based deontic
logic defined by $O(\alpha \mid \beta) =_{def} \alpha \land \beta \succ \neg\alpha \land \beta$ (for any betterness relation $\succ$) or
by $O(\alpha \mid \beta) =_{def} I(\alpha \mid \beta)$.

R $O(\alpha \mid \beta_1 \land \beta_2) \to O(\alpha \land \beta_1 \mid \beta_1 \land \beta_2)$

Second, the logic PDL does not have closure under logical implication. This is
a typical property of preference-based deontic logics. For example, the
preference-based deontic logics discussed in [19,15,10,4,17] do not have closure under logical
implication either. The following formula Weakening of the Consequent WC is
not valid in PDL.

WC $O(\alpha_1 \mid \beta) \to O(\alpha_1 \lor \alpha_2 \mid \beta)$

The third property we consider is the following disjunction rule OR, related
to Reasoning-By-Cases and Savage’s sure-thing principle. It is not valid either.

OR $(O(\alpha \mid \beta_1) \land O(\alpha \mid \beta_2)) \to O(\alpha \mid \beta_1 \lor \beta_2)$

The fourth property we consider is so-called Restricted Strengthening of the
Antecedent RSA, expressed by the following theorem of the logic. It can easily
be shown that $O(\alpha \mid \beta_1 \land \beta_2)$ can only be derived in PDL from $O(\alpha \mid \beta_1)$ when we
have $I(\alpha \mid \beta_1 \land \beta_2)$ as well.

RSA $(O(\alpha \mid \beta_1) \land I(\alpha \mid \beta_1 \land \beta_2)) \to O(\alpha \mid \beta_1 \land \beta_2)$

We can add strengthening of the antecedent with the following notion of
preferential entailment, that prefers maximally connected models. We say that
a model is more connected if its binary relation contains more elements (in our
terminology $\{(w_1, w_2), (w_2, w_1)\}$ is therefore more connected than $\{(w_1, w_2)\}$).

Definition 4 (Preferential entailment). Let the two possible worlds models
$M_1 = \langle W, \leq_1, V \rangle$ and $M_2 = \langle W, \leq_2, V \rangle$ be two PDL models. $M_1$ is at least as
connected as $M_2$, written as $M_1 \sqsubseteq M_2$, iff for all $w_1, w_2 \in W$ if $w_1 \leq_2 w_2$, then
$w_1 \leq_1 w_2$. $M_1$ is more connected than $M_2$, written as $M_1 \sqsubset M_2$, iff $M_1 \sqsubseteq M_2$
and $M_2 \not\sqsubseteq M_1$. The formula $\phi$ is preferentially entailed by $T$, written as $T \models_{\sqsubset} \phi$,
iff $M \models \phi$ for all maximally connected models $M$ of $T$.

The maximally connected models of a set of obligations are unique (for a
given $W$ and $V$) if the transitivity axiom 4: $\Box\alpha \to \Box\Box\alpha$ is omitted from the
axiomatization. The unique maximally connected model of the set of obligations
$S = \{O(\alpha_i \mid \beta_i) \mid 1 \leq i \leq n\}$ has the accessibility relation $\{w_1 \leq w_2 \mid$
there is no $O(\alpha \mid \beta) \in S$ such that $M, w_1 \models \neg\alpha \land \beta$ and $M, w_2 \models \alpha \land \beta\}$.
However, if axiom 4 is omitted, then ‘preferred’ in $I(\alpha \mid \beta)$ no longer has a natural
meaning. Finally, if only full models are considered in the semantics, i.e. models
that contain a world for each possible interpretation, then we can derive for
example $\emptyset \models_{\sqsubset} \neg O(p \mid \top)$, because there cannot be a model with only $p$ worlds. In
the following section preferential entailment is illustrated by several examples.

5 The three problems reconsidered 

In the introduction of this paper we mentioned three problems: the strong
preference problem, the contrary-to-duty problem and the dilemma problem. In this
section we show how Prohairetic Deontic Logic solves the three problems. The
strong preference problem is that preferences for $\alpha_1$ and $\alpha_2$ conflict for $\alpha_1 \land \neg\alpha_2$
and $\neg\alpha_1 \land \alpha_2$. The following example illustrates that the problem is solved by
the dynamics of preferential entailment. It also illustrates why the logic is
non-monotonic.

Example 5 (Polite and helpful, continued). Consider the three sets of obligations
$S = \emptyset$, $S' = \{O(p \mid \top)\}$ and $S'' = \{O(p \mid \top), O(h \mid \top)\}$. The three unique maximally
connected models of $S$, $S'$ and $S''$ are represented in Figure 1. With no premises,
all worlds are equally ideal. By addition of the premise $O(p \mid \top)$, the $p$ worlds are
strictly preferred over $\neg p$ worlds. Moreover, by addition of the second premise
$O(h \mid \top)$, the $h$ worlds are strictly preferred over $\neg h$ worlds, and the $p \land \neg h$ and
$\neg p \land h$ worlds become incomparable. Hence, the strong preference problem is
solved by representing conflicts with incomparable worlds. This solution uses
preferential entailment, a technique from non-monotonic reasoning, because for
the preferred models we have that all incomparable worlds refer to some conflict.
We have $S' \models_{\sqsubset} O(p \mid \neg(p \land h))$ and $S'' \not\models_{\sqsubset} O(p \mid \neg(p \land h))$. By addition of a
formula we lose conclusions.
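
The loss of the conclusion in Example 5 can be recomputed mechanically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses full models with exactly one world per valuation of the atoms $p$ and $h$, it builds the accessibility relation with the formula given for the unique maximally connected model at the end of Section 4 (for these two premise sets the resulting relation is also transitive), and the names `max_connected`, `ideal` and `obligation` are hypothetical.

```python
from itertools import product

WORLDS = list(product([True, False], repeat=2))  # a world is a pair (p, h)
p = lambda w: w[0]
h = lambda w: w[1]
top = lambda w: True
not_both = lambda w: not (w[0] and w[1])          # the context "not (p and h)"

def max_connected(obligations):
    """w1 <= w2 unless some O(alpha|beta) is violated in w1 (beta, not alpha)
    and fulfilled in w2 (beta, alpha)."""
    return {(w1, w2) for w1 in WORLDS for w2 in WORLDS
            if not any(beta(w1) and not alpha(w1) and beta(w2) and alpha(w2)
                       for alpha, beta in obligations)}

def ideal(leq, alpha, beta):
    """I(alpha|beta): every beta-world sees an at-least-as-good beta-world
    below which all beta-worlds satisfy alpha."""
    def settled(v):
        return all(alpha(u) for u in WORLDS if (u, v) in leq and beta(u))
    return all(any(beta(v) and (v, w) in leq and settled(v) for v in WORLDS)
               for w in WORLDS if beta(w))

def obligation(leq, alpha, beta):
    """O(alpha|beta): no (not alpha, beta)-world is at least as good as an
    (alpha, beta)-world, and I(alpha|beta) holds."""
    good = [w for w in WORLDS if beta(w) and alpha(w)]
    bad = [w for w in WORLDS if beta(w) and not alpha(w)]
    return (all((w2, w1) not in leq for w1 in good for w2 in bad)
            and ideal(leq, alpha, beta))

m1 = max_connected([(p, top)])             # premise set S'
m2 = max_connected([(p, top), (h, top)])   # premise set S''

print(obligation(m1, p, not_both))  # O(p | not (p and h)) holds for S'
print(obligation(m2, p, not_both))  # and is lost for S''
```

In `m2` the worlds `(True, False)` and `(False, True)` are related in neither direction, which is the incomparability the example describes.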
The solution of the contrary-to-duty problem is based on the dyadic
representation. The solution is illustrated by the representation of Forrester’s [9]
and Chisholm’s [6] paradoxes in Prohairetic Deontic Logic. The paradoxes were
originally formulated in monadic deontic logic.

Example 6 (Forrester’s paradox). Consider $S = \{O(\neg k \mid \top), O(g \mid k), k\}$, where
$k$ can be read as ‘Smith kills Jones’ and $g$ as ‘Smith kills him gently,’ and $g$
logically implies $k$. The unique maximally connected model of $S$ is represented
in Figure 2. The actual world is any of the $k$ worlds. The formalization of $S$ is
unproblematic and the semantics reflect the three states that seem to be implied
by the paradox. We have $S \models O(\neg k \lor g \mid \top)$ as a consequence of the theorem
$(O(\alpha \mid \neg\beta) \land O(\beta \mid \gamma)) \to O(\alpha \lor \beta \mid \gamma)$, which expresses that $k \land \neg g$ is the worst
state that should be avoided.



[Figure 2: an ideal situation and sub-ideal situations.]

Fig. 2. Unique maximally connected model of $\{O(\neg k \mid \top), O(g \mid k), k\}$



Example 7 (Chisholm’s paradox). Consider $S = \{O(a \mid \top), O(t \mid a), O(\neg t \mid \neg a), \neg a\}$,
where $a$ can be read as ‘a certain man going to the assistance of his neighbors’
and $t$ as ‘telling the neighbors that he will come.’ The unique maximally
connected model of $S$ is represented in Figure 3.

[Figure 3: an ideal situation and ordered sub-ideal situations.]

Fig. 3. Unique maximally connected model of $\{O(a \mid \top), O(t \mid a), O(\neg t \mid \neg a), \neg a\}$

The crucial question of Chisholm’s paradox is whether there is an obligation
that ‘the man should tell the neighbors that he will come’ $O(t \mid \top)$. This
obligation is counterintuitive, given that $a$ is false. This obligation is derived
from $S$ by any deontic logic that has so-called deontic detachment (also called
deontic transitivity), represented by the following formula DD°.

DD° $(O(\alpha \mid \beta) \land O(\beta \mid \gamma)) \to O(\alpha \mid \gamma)$
However, DD° is not valid in Prohairetic Deontic Logic. The obligation ‘the
man should tell his neighbors that he will come’ $O(t \mid \top)$ cannot be derived in
PDL from $S$. The following weaker version of DD° is valid in PDL.

RDD $(O(\alpha \mid \beta \land \gamma) \land O(\beta \mid \gamma) \land I(\alpha \land \beta \mid \gamma)) \to O(\alpha \land \beta \mid \gamma)$

The obligation that ‘the man should go to his neighbors and tell his neighbors
that he will come’ $O(a \land t \mid \top)$ can (preferentially) be derived from $S$.

Finally, we show that Prohairetic Deontic Logic solves the dilemma problem, 
because it makes the considerate assassin set in Example 8 inconsistent, without 
making the window set in Example 9 inconsistent. 

Example 8 (Considerate assassin). Consider $S = \{O(\neg c \mid \top), O(c \mid k)\}$, where $c$
can be read as ‘Smith offers Jones a cigarette’ and $k$ can be read as ‘Smith kills
Jones.’ The set $S$ is inconsistent with $\overleftrightarrow{\Diamond}(k \land \neg c)$, as can be verified as follows.
Assume there is a model of $S$. The obligation $O(c \mid k)$ implies $I(c \mid k)$, which
means that for every world $w_1$ such that $M, w_1 \models \neg c \land k$ there is a world $w_2$
such that $M, w_2 \models c \land k$ and $w_2 < w_1$ (i.e. $w_2 \leq w_1$ and $w_1 \not\leq w_2$). However, the
obligation $O(\neg c \mid \top)$ implies $\neg c \succ c$, which means that for all worlds $w_1$ such that
$M, w_1 \models \neg c \land k$ there is not a world $w_2$ such that $M, w_2 \models c \land k$ and $w_2 \leq w_1$.
These two conditions are contradictory (if there is such a world $w_1$).

Moreover, consider $S' = \{O(\neg d \mid \top), O(d \land p \mid d), O(\neg p \mid \top)\}$, where $d$ can be
read as ‘there is a dog’ and $p$ as ‘there is a poodle.’ Prakken and Sergot [27]
argue that $S'$ should be inconsistent, based on its analogy with $S$. For similar
reasons as the inconsistency of $S$ above, the set $S'$ is inconsistent in PDL.
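
The inconsistency argument of Example 8 can also be confirmed by exhaustive search over a small model space. The sketch below makes assumptions that go beyond the text: the model has exactly one world for each valuation of $c$ and $k$ (so a $k \land \neg c$ world exists), every reflexive and transitive relation on these four worlds is enumerated, and the helper names `preorders`, `ideal` and `obligation` are hypothetical.

```python
from itertools import product

WORLDS = list(product([True, False], repeat=2))  # a world is a pair (c, k)
c = lambda w: w[0]
not_c = lambda w: not w[0]
k = lambda w: w[1]
top = lambda w: True

def preorders():
    """Yield every reflexive and transitive relation on WORLDS."""
    diagonal = {(w, w) for w in WORLDS}
    off = [(u, v) for u in WORLDS for v in WORLDS if u != v]
    for bits in product([False, True], repeat=len(off)):
        leq = diagonal | {pair for pair, keep in zip(off, bits) if keep}
        if all((u, x) in leq for (u, v) in leq for (v2, x) in leq if v == v2):
            yield leq

def ideal(leq, alpha, beta):
    """I(alpha|beta) on a finite model."""
    def settled(v):
        return all(alpha(u) for u in WORLDS if (u, v) in leq and beta(u))
    return all(any(beta(v) and (v, w) in leq and settled(v) for v in WORLDS)
               for w in WORLDS if beta(w))

def obligation(leq, alpha, beta):
    """O(alpha|beta): strict betterness of (alpha, beta) over (not alpha, beta),
    together with I(alpha|beta)."""
    good = [w for w in WORLDS if beta(w) and alpha(w)]
    bad = [w for w in WORLDS if beta(w) and not alpha(w)]
    return (all((w2, w1) not in leq for w1 in good for w2 in bad)
            and ideal(leq, alpha, beta))

joint = [leq for leq in preorders()
         if obligation(leq, not_c, top) and obligation(leq, c, k)]
print(len(joint))  # number of orderings satisfying both obligations
print(any(obligation(leq, not_c, top) for leq in preorders()))  # each one alone
print(any(obligation(leq, c, k) for leq in preorders()))        # is satisfiable
```

The search only covers this one four-world frame; it complements the general argument given in the example rather than replacing it.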

Example 9 (Window). Consider $S = \{O(c \mid r), O(\neg c \mid s)\}$, where $c$ can be read
as ‘the window is closed,’ $r$ as ‘it starts raining’ and $s$ as ‘the sun is shining.’
It is argued by von Wright [42] that $S$ does not represent a dilemma and that
it should therefore be consistent, see also [1]. In PDL the set $S$ is consistent,
and a maximally connected model $M$ of $S$ is given in Figure 4. The ideal worlds
satisfy $r \to c$ and $s \to \neg c$, and the sub-ideal worlds either $\neg c \land r$ or $c \land s$. We
have $M \not\models O(c \mid r \land s)$ and thus $S \not\models_{\sqsubset} O(c \mid r \land s)$.

[Figure 4: an ideal situation and sub-ideal situations.]

Fig. 4. Preferred model of $\{O(c \mid r), O(\neg c \mid s)\}$

Note that there are many maximally connected models. For example, the $c \land r$
worlds can be the only preferred worlds, when the $\neg c \land s$ worlds are equivalent
with the $\neg c \land r$ worlds. Alternatively, the $\neg c \land s$ worlds can be the only most
preferred worlds.
6 Related research

Besides the deontic logics already discussed in this paper, there is a
preference-based deontic logic proposed by Brown, Mantha and Wakayama [4]. We write
the obligations in their logic as $O_{BMW}$. At first sight it seems that the logic is
closely related to Prohairetic Deontic Logic, because $O_{BMW}\alpha$ also has a mixed
representation. However, a further inspection of the definitions reveals that their
proposal is quite different from ours. Obligations are defined by

$O_{BMW}\alpha =_{def} P_f\alpha \land A_m\alpha = \overleftarrow{\Box}\neg\alpha \land \Diamond\alpha$

where $P_f\alpha$ is read as ‘$\alpha$ is preferred,’ $A_m\alpha$ is read as ‘$\alpha$ is admissible,’ and $\overleftarrow{\Box}\alpha$
is read as ‘$\alpha$ is true in all inaccessible worlds.’ Hence, $O_{BMW}\alpha$ means ‘the truth
of $\alpha$ takes us to a world at least as good as the current world and there exists a
world at least as good as the current world where $\alpha$ is true’ [4, p.200]. The first
distinction is that in the logic, dilemmas are consistent (which they consider an
advantage, following [40]). Secondly, the motivation for the mixed representation
is different. Whereas we introduced the mixed representation to solve both the
contrary-to-duty problem (for which we use $\alpha \succ \neg\alpha$) and the dilemma problem
(for which we use $I\alpha$), they use the mixed representation to block the derivation
of the theorem $O_{BMW}(\alpha_1 \lor \alpha_2) \to O_{BMW}\alpha_1$, which they consider as ‘somewhat
unreasonable, since the premise is weaker than the conclusion.’ However, it is
easily checked that the logic validates $O_{BMW}(\alpha_1 \lor \alpha_2) \land A_m\alpha_1 \to O_{BMW}\alpha_1$.
Hence, under certain circumstances stronger obligations can be derived. This
counterintuitive formula is not a theorem of PDL.

7 Conclusions 

In this paper, we introduced Prohairetic Deontic Logic. We showed that it gives
a satisfactory solution to the strong preference problem, the contrary-to-duty
problem and the dilemma problem. We are now studying the use of PDL in legal
expert systems and in the specification of intelligent agents for the Internet [7,8],
for example for the drafting, negotiation and processing of trade contracts in
electronic commerce, and its relevance for logics of desires and goals as these are
developed in qualitative decision theory [25,36]. Moreover, in [38] an
implementation is envisaged for a sublanguage of PDL.
Abstract. In this paper we introduce phased labeled logics of conditional
goals. Labels are used to impose restrictions on the proof theory
of the logic. The restriction discussed in this paper is that a proof rule 
can be blocked in a derivation due to the fact that another proof rule 
has been applied earlier in the derivation. We call a set of proof rules 
that can be applied in any order a phase in the proof theory. We pro- 
pose a one-phase logic of goals containing four proof rules, and we show 
that it is equivalent to a four-phase logic of goals in which each phase 
contains exactly one proof rule. The proof theory of the four-phase logic 
of goals is much more efficient, because other orderings no longer have 
to be considered. 



1 Introduction 

In the usual approaches to planning in AI, a planning agent is provided with 
a description of some state of affairs, a goal state, and charged with the task 
of discovering (or performing) some sequence of actions to achieve that goal. 
Recently several logics for conditional or context-sensitive goals and desires have 
been proposed [3,2,10,1,12,11,8,7] in the context of qualitative decision theory. 
In [15] we introduced a version of a labeled deductive system [4] to reason about
goals. Labeled goals $G(\alpha \mid \beta)_L$ can roughly be read as ‘preferably $\alpha$ if $\beta$, against
the background of $L$.’ The label keeps track of the context in which the goal
is derived. The labeled logic has some desirable properties not found in other proposals. First,
the logic can reason about conflicting goals. This is important, because goals 
only impose partial preferences, i.e. preferences given some objective and given 
some context. Objectives can conflict and, as a consequence, goals with overlap- 
ping contexts can conflict. Second, the labeled logics are stronger than previous 
proposals in the sense that they validate strengthening of the antecedent and 
transitivity. It has been shown in [15] that these proof rules can only be combined 
with the desirable proof rule weakening of the consequent if additional machin- 
ery like labels is introduced in the logic. Otherwise counterintuitive conclusions 
follow. 

In the phased labeled logics of goals (PLLG) introduced in this paper we show
how to phase derivations. To impose phasing restrictions on the derivations, the
labeled logics of goals are extended in two ways. First, a phase is associated
with each PLLG proof rule by an explicitly given phasing function. Second, the
label of a PLLG goal not only contains a set of sets of propositional formulas,
the fulfillments ($F$) of the premises from which the goal is derived, but it also
contains an integer, the phase ($p$) in which the goal is derived. There are two
restrictions on the use of proof rules in PLLG. First, as in other labeled logics
of goals, a consistency check on $F$ is used to restrict derivations such that all
premises can be fulfilled. The fulfillment of a premise $G(\alpha \mid \beta)$ is the propositional
formula $\alpha \land \beta$ [19] and the fulfillment of a derived goal depends on the premises
from which it is derived. Later in this paper we show how this consistency check
restricts derivations to a kind of constructive proofs, in the sense that no goals
are derived from conflicts. Second, and this is new, the phase $p$ is used to select
the proof rules that still may be applied.

In this paper we focus on a one-phase labeled logic of goals (LLG) and a
four-phase labeled logic of goals (4LLG). Both contain the four proof rules
strengthening of the antecedent, a type of transitivity, weakening of the consequent and
the disjunction rule for the antecedent. Moreover, we show that the conjunction
rule for the consequent follows from these four rules. In LLG these rules can be
applied in any order, but in 4LLG there is only one order in which they can
be applied. Nevertheless, we prove that every formula that can be derived in
LLG can also be derived in 4LLG. Consequently, in the proof theory of LLG we
can restrict ourselves to one specific order of the proof rules, which makes the
logic much more efficient. Finally, we also prove that in 4LLG the consistency
check on the label can be replaced by a consistency check on the antecedent and
consequent.



2 Phased labeled logic of goals (PLLG)

Phased labeled logics of goals are versions of a labeled deductive system as it was
introduced by Gabbay in [4]. Roughly speaking, the label $L$ of a goal $G(\alpha \mid \beta)_L$
consists of a record of the fulfillments ($F$) of the premises that are used in the
derivation of $G(\alpha \mid \beta)$, and the phase ($p$) in which it is derived. Where there is no
application of reasoning by cases, $F$ can be taken to be a set of boolean formulas,
that grows by joining sets as premises are combined. But in general, to cover
the parallel tracks created through reasoning by cases, we need to consider sets
of sets of boolean formulas [9].

Definition 1 (Language). Let $\mathcal{L}$ be a propositional base logic. The language
of PLLG consists of the labeled dyadic goals $G(\alpha \mid \beta)_L$, with $\alpha$ and $\beta$ sentences of
$\mathcal{L}$, and $L$ a pair $(F, p)$ that consists of a set of sets of sentences of $\mathcal{L}$ (fulfillments)
and an integer (the phase). We write $\models$ for entailment in $\mathcal{L}$.

Each formula $G(\alpha \mid \beta)_L$ occurring as a premise has a label that consists of its
own (propositionally consistent) fulfillment and phase 0.

Definition 2 (Premise). A formula $G(\alpha \mid \beta)_{(\{\{\alpha \land \beta\}\}, 0)}$, where $\alpha \land \beta$ is
consistent in $\mathcal{L}$, is called a premise of PLLG.
The phase of a goal is determined by the proof rule used to derive the goal,
and the set of fulfillments is the union (OR) or the product (SA, TRANS) of the
labels of the premises used in this inference rule, where the product is defined by
$\{S_1, \ldots, S_n\} \times \{T_1, \ldots, T_m\} = \{S_1 \cup T_1, \ldots, S_1 \cup T_m, \ldots, S_n \cup T_m\}$. The labels
are used to check that fulfillments are consistent and that the phase of reasoning
is non-decreasing. In a normative context the consistency check realizes a variant
of the Kantian principle that ‘ought implies can.’
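
The product of labels and the consistency check can be made concrete with a small sketch. It rests on assumptions that are not in the paper: formulas are written as Python boolean expressions over atom names, a fulfillment set is a `frozenset` of such strings, consistency is decided by brute-force truth tables, and the names `label_product`, `consistent` and `label_ok` are hypothetical.

```python
from itertools import product

def label_product(f1, f2):
    """{S1..Sn} x {T1..Tm}: every union of one fulfillment set from each label."""
    return {s | t for s in f1 for t in f2}

def consistent(formulas, atoms):
    """True if some truth assignment to `atoms` satisfies all formulas.
    Formulas are strings such as "a and not b" (an encoding chosen for this sketch)."""
    for bits in product([False, True], repeat=len(atoms)):
        env = dict(zip(atoms, bits))
        if all(eval(f, {"__builtins__": {}}, env) for f in formulas):
            return True
    return False

def label_ok(label, atoms):
    """The consistency condition on a label: each fulfillment set is consistent
    on its own, though not necessarily all of them together."""
    return all(consistent(fs, atoms) for fs in label)

f1 = {frozenset({"a and b"})}
f2 = {frozenset({"not c"}), frozenset({"c"})}
combined = label_product(f1, f2)
print(sorted(sorted(fs) for fs in combined))
print(label_ok(combined, ["a", "b", "c"]))

clash = label_product({frozenset({"p"})}, {frozenset({"not p"})})
print(label_ok(clash, ["p"]))  # a label built from conflicting goals fails the check
```

Keeping a set of sets, rather than one flat set, is what lets the two tracks `{"a and b", "not c"}` and `{"a and b", "c"}` coexist after reasoning by cases.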

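Definition 3 (PLLG). Let $\rho$ be a phasing function that associates with each
proof rule below an integer called its phase. The phased labeled logic of goals
PLLG for $\rho$ consists of the inference rules below, extended with the following two
conditions $R = R_F + R_\rho$.

$R_F$: $G(\alpha \mid \beta)_{(F,p)}$ may only be derived if each $F_i \in F$ is consistent: it must always
be possible to fulfill a derived goal and each of the goals it is derived from,
though not necessarily all of them at the same time.

$R_\rho$: $G(\alpha \mid \beta)_{(F,p)}$ may only be derived if $p \geq p_i$ for all goals $G(\alpha_i \mid \beta_i)_{(F_i,p_i)}$ it is
derived from.

The inference rules of PLLG are replacements by logical equivalents (for
antecedent and consequent) and the following four rules.

SA:
$G(\alpha \mid \beta_1)_{(F,p)}$, $R$
----------------------------------------
$G(\alpha \mid \beta_1 \land \beta_2)_{(F \times \{\beta_2\}, \rho(SA))}$

TRANS:
$G(\alpha \mid \beta \land \gamma)_{(F_1,p_1)}$, $G(\beta \mid \gamma)_{(F_2,p_2)}$, $R$
----------------------------------------
$G(\alpha \land \beta \mid \gamma)_{(F_1 \times F_2, \rho(TRANS))}$

WC:
$G(\alpha_1 \mid \beta)_{(F,p)}$, $R$
----------------------------------------
$G(\alpha_1 \lor \alpha_2 \mid \beta)_{(F, \rho(WC))}$

OR:
$G(\alpha \mid \beta_1)_{(F_1,p_1)}$, $G(\alpha \mid \beta_2)_{(F_2,p_2)}$, $R$
----------------------------------------
$G(\alpha \mid \beta_1 \lor \beta_2)_{(F_1 \cup F_2, \rho(OR))}$

We say $\{G(\alpha_i \mid \beta_i) \mid 1 \leq i \leq n\} \vdash_{PLLG} G(\alpha \mid \beta)$ if there is a labeled goal $G(\alpha \mid \beta)_L$
that can be derived from the set of goals $\{G(\alpha_i \mid \beta_i)_{(\{\{\alpha_i \land \beta_i\}\}, 0)} \mid 1 \leq i \leq n\}$.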

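The unusual transitivity rule (TRANS) implies, under certain circumstances,
the standard transitivity rule as well as the conjunction rule. First, if we have
$\rho(SA) \leq \rho(TRANS) \leq \rho(WC)$, then we have the following derivation.

$G(\alpha \mid \beta)_{(F_1,p_1)}$
---------------------------------------- SA
$G(\alpha \mid \beta \land \gamma)_{(F_1 \times \{\gamma\}, \rho(SA))}$     $G(\beta \mid \gamma)_{(F_2,p_2)}$
---------------------------------------- TRANS
$G(\alpha \land \beta \mid \gamma)_{(F_1 \times \{\gamma\} \times F_2, \rho(TRANS))}$
---------------------------------------- WC
$G(\alpha \mid \gamma)_{(F_1 \times \{\gamma\} \times F_2, \rho(WC))}$

Consequently, the following standard transitivity rule is implied by PLLG if we
have $\rho(SA) = \rho(TRANS) = \rho(WC)$, i.e. if SA, TRANS and WC are in the same
phase, because $F_2$ implies $\gamma$ (see also Proposition 10).

TRANS':
$G(\alpha \mid \beta)_{(F_1,p_1)}$, $G(\beta \mid \gamma)_{(F_2,p_2)}$, $R$
----------------------------------------
$G(\alpha \mid \gamma)_{(F_1 \times F_2, \rho(TRANS))}$

Second, concerning the conjunction rule, if $\rho(SA) \leq \rho(TRANS)$, then we can
first strengthen $G(\alpha_1 \mid \beta)$ to $G(\alpha_1 \mid \beta \land \alpha_2)$, and then apply TRANS as follows to
derive $G(\alpha_1 \land \alpha_2 \mid \beta)$.

$G(\alpha_1 \mid \beta)_{(F_1,p_1)}$
---------------------------------------- SA
$G(\alpha_1 \mid \beta \land \alpha_2)_{(F_1 \times \{\alpha_2\}, \rho(SA))}$     $G(\alpha_2 \mid \beta)_{(F_2,p_2)}$
---------------------------------------- TRANS
$G(\alpha_1 \land \alpha_2 \mid \beta)_{(F_1 \times \{\alpha_2\} \times F_2, \rho(TRANS))}$

Consequently, the following conjunction rule is implied by the logic PLLG if we
have $\rho(SA) = \rho(TRANS)$, i.e. if SA and TRANS are in the same phase, because $F_2$
implies $\alpha_2$ (Proposition 10).

AND:
$G(\alpha_1 \mid \beta)_{(F_1,p_1)}$, $G(\alpha_2 \mid \beta)_{(F_2,p_2)}$, $R$
----------------------------------------
$G(\alpha_1 \land \alpha_2 \mid \beta)_{(F_1 \times F_2, \rho(TRANS))}$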

Labeled logics of goals can reason about conflicting goals, and they can
combine several proof rules without deriving counterintuitive consequences. It has
been shown in [15] that this can be achieved by using the fulfillments without
using the phases. First, the logics can reason about conflicting goals, because we
have $G(p), G(\neg p) \nvdash_{PLLG} G(p \land \neg p)$ and $G(p), G(\neg p) \nvdash_{PLLG} G(q)$, where $G(\alpha)$ is short
for $G(\alpha \mid \top)$, $\top$ stands for any tautology like $p \lor \neg p$ and $q$ is not logically implied
by $p$ or $\neg p$. In particular, the fulfillments in the labels are used to block the
second derivation step in the following counterintuitive derivation [15], in which
a blocked derivation step is represented by a dashed line. Moreover, it follows
from the results presented later in this paper that this counterintuitive derivation
can also be blocked by giving WC a higher phase than AND.

$G(p)$
-------------------- WC
$G(p \lor q)$     $G(\neg p)$
- - - - - - - - - - AND
$G(q \land \neg p)$
-------------------- WC
$G(q)$

Second, it is easily checked how the fulfillments in the labels are used to
combine strengthening of the antecedent (SA) with weakening of the consequent
(WC) without validating the following counterintuitive derivation [15]. Moreover,
this counterintuitive derivation can also be blocked by giving WC a higher phase
than SA, together with a consistency check on the conjunction of antecedent and
consequent.

$G(c \mid \top)$
-------------------- WC
$G(c \lor t \mid \top)$
- - - - - - - - - - SA
$G(c \lor t \mid \neg c)$
3 Labeled logic of goals (LLG): some examples

In this section we illustrate the use of the consistency check on $F$ in the
labeled logics of goals by a variant of the logic proposed in [15], and in the
following sections we study the use of different phases in the proof theory. The
logic LLG is the PLLG that consists of only one phase.

Definition 4 (LLG). The logic LLG is the PLLG with the phasing function $\rho$
defined by $\rho(SA) = 1$, $\rho(TRANS) = 1$, $\rho(WC) = 1$, $\rho(OR) = 1$.

The logic LLG derives the proof rules TRANS' and AND discussed in the
previous section, because SA, TRANS and WC (respectively SA and TRANS) are in the
same phase. The following example illustrates the labeled logic of conditional
goals.

Example 5. Consider the set $S = \{G(a \lor p \mid \top), G(\neg a \mid \top)\}$ as premise set, where
$a$ can be read as ‘buying apples’ and $p$ as ‘buying pears’ (taken from [13]). We
have $S \nvdash_{LLG} G(p \mid a)$, as desired. Below it is shown how two derivations of the
counterintuitive $G(p \mid a)$ are blocked. The non-derived goal is counterintuitive,
because when $a$ is true (its antecedent) then the first premise is fulfilled and the
second is violated. This pattern holds irrespective of whether $p$ is true or false.
Buying pears does not ‘improve’ the situation, once apples are bought. Hence,
once $a$ is assumed, there is no longer any reason to infer $p$.



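$G(a \lor p \mid \top)_{(\{\{a \lor p\}\}, 0)}$     $G(\neg a \mid \top)_{(\{\{\neg a\}\}, 0)}$
---------------------------------------- AND
$G(\neg a \land p \mid \top)_{(\{\{a \lor p, \neg a\}\}, 1)}$
- - - - - - - - - - - - - - - - - - - - SA
$G(\neg a \land p \mid a)_{(\{\{a \lor p, \neg a, a\}\}, 1)}$
- - - - - - - - - - - - - - - - - - - - WC
$G(p \mid a)_{(\{\{a \lor p, \neg a, a\}\}, 1)}$

$G(a \lor p \mid \top)_{(\{\{a \lor p\}\}, 0)}$     $G(\neg a \mid \top)_{(\{\{\neg a\}\}, 0)}$
---------------------------------------- AND
$G(\neg a \land p \mid \top)_{(\{\{a \lor p, \neg a\}\}, 1)}$
---------------------------------------- WC
$G(p \mid \top)_{(\{\{a \lor p, \neg a\}\}, 1)}$
- - - - - - - - - - - - - - - - - - - - SA
$G(p \mid a)_{(\{\{a \lor p, \neg a, a\}\}, 1)}$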
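
The blocked step in the apples-and-pears example comes down to one consistency test on a label. The following sketch replays it under assumptions of its own: formulas are Python boolean expressions over the atoms `a` and `p`, labels are sets of `frozenset`s, and the helper `consistent` is a hypothetical brute-force check rather than anything prescribed by the paper.

```python
from itertools import product

ATOMS = ["a", "p"]

def consistent(formulas):
    """True if some truth assignment to ATOMS satisfies every formula."""
    for bits in product([False, True], repeat=len(ATOMS)):
        env = dict(zip(ATOMS, bits))
        if all(eval(f, {"__builtins__": {}}, env) for f in formulas):
            return True
    return False

premise1 = {frozenset({"a or p"})}   # label of G(a or p | T)
premise2 = {frozenset({"not a"})}    # label of G(not a | T)

# AND combines the premises: the new label is the product of the two labels.
after_and = {s | t for s in premise1 for t in premise2}
# SA with the extra antecedent a adds a to every fulfillment set.
after_sa = {s | {"a"} for s in after_and}

print(all(consistent(fs) for fs in after_and))  # the AND step passes the check
print(all(consistent(fs) for fs in after_sa))   # the SA step is blocked
```

The second label contains both `not a` and `a`, so no assignment fulfills it, which is why the step to the antecedent `a` is not allowed.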



The following example illustrates how the transitivity rule formalizes that
conditional rules can be applied one after the other. Derivations go ‘as far as
possible.’

Example 6. Consider the set of goals $S = \{G(a \mid b), G(b \mid c), G(c \mid \top), G(\neg a \mid \top)\}$.
There is a conflict for $a$, because we have $S \vdash_{LLG} G(a \mid \top)$ and $S \vdash_{LLG} G(\neg a \mid \top)$.
There is not a conflict for $b$, because we have $S \vdash_{LLG} G(b \mid \top)$ and $S \nvdash_{LLG} G(\neg b \mid \top)$.
Hence, derivation chains go as far as possible and there is no weak
contraposition as, for example, in conditional entailment [5], see also the discussion
in [9]. Moreover, consider the set of goals $S' = \{G(a \mid b), G(\neg c \mid b \lor c)\}$. We have
$S' \vdash_{LLG} G(a \mid b \lor c)$, as desired. The following derivation illustrates how this more
complex form of transitivity is supported.

$G(a \mid b)_{(\{\{a \land b\}\}, 0)}$
---------------------------------------- SA
$G(a \mid b \land \neg c)_{(\{\{a \land b, \neg c\}\}, 1)}$     $G(\neg c \mid b \lor c)_{(\{\{b \land \neg c\}\}, 0)}$
---------------------------------------- TRANS
$G(a \land \neg c \mid b \lor c)_{(\{\{a \land b, \neg c, b \land \neg c\}\}, 1)}$
---------------------------------------- WC
$G(a \mid b \lor c)_{(\{\{a \land b, \neg c, b \land \neg c\}\}, 1)}$

The third example illustrates how the disjunction rule supports reasoning by
cases.
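Example 7. Consider the set $S = \{G(a \land c \mid b), G(a \land \neg c \mid \neg b)\}$ (taken from [9]). We
have $S \vdash_{LLG} G(a \mid \top)$, as desired. The following derivation illustrates this complex
type of reasoning by cases.

$G(a \land c \mid b)_{(\{\{a \land c \land b\}\}, 0)}$     $G(a \land \neg c \mid \neg b)_{(\{\{a \land \neg c \land \neg b\}\}, 0)}$
-------------------- WC     -------------------- WC
$G(a \mid b)_{(\{\{a \land c \land b\}\}, 1)}$     $G(a \mid \neg b)_{(\{\{a \land \neg c \land \neg b\}\}, 1)}$
---------------------------------------- OR
$G(a \mid \top)_{(\{\{a \land c \land b\}, \{a \land \neg c \land \neg b\}\}, 1)}$

On the other hand, consider the set $S = \{G(a \mid b), G(a \mid \neg b)\}$. In [14,21] it is
argued that $G(a \mid a \leftrightarrow b)$ is counterintuitive and should therefore not be derived.
We have $S \nvdash_{LLG} G(a \mid a \leftrightarrow b)$, and we have the blocked derivation below. The
non-derived goal is counterintuitive, because when $a \leftrightarrow b$ is true (its antecedent)
then the first premise is fulfilled when its antecedent is true and the second is
violated when its antecedent is true. This pattern holds irrespective of whether
$a$ is true or false. Hence, once $a \leftrightarrow b$ is assumed, there is no longer any reason
to infer $a$.

$G(a \mid b)_{(\{\{a \land b\}\}, 0)}$     $G(a \mid \neg b)_{(\{\{a \land \neg b\}\}, 0)}$
---------------------------------------- OR
$G(a \mid \top)_{(\{\{a \land b\}, \{a \land \neg b\}\}, 1)}$
- - - - - - - - - - - - - - - - - - - - SA
$G(a \mid a \leftrightarrow b)_{(\{\{a \land b, a \leftrightarrow b\}, \{a \land \neg b, a \leftrightarrow b\}\}, 1)}$

In the following section we study the additional expressive power of PLLG
over LLG by introducing different phases in the proof theory.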

4 Four-phase labeled logic of goals (4LLG)

In this section we discuss a labeled deontic logic which completely orders the
derivations of LLG in the following order: SA, TRANS, WC, OR. We call the
resulting logic four-phase labeled logic of goals 4LLG.

Definition 8 (4LLG). The logic 4LLG is the PLLG with the phasing function $p$
defined by $p(\text{SA}) = 1$, $p(\text{TRANS}) = 2$, $p(\text{WC}) = 3$, $p(\text{OR}) = 4$.
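The effect of a phasing function on the order of proof rules can be sketched as follows. This is only an illustration, assuming that the phase condition permits a rule when none of its premises was derived in a later phase and that premises carry phase 0; the tree encoding and the name `respects_phasing` are hypothetical.

```python
# Phasing function of 4LLG as given in Definition 8.
PHASING_4LLG = {"SA": 1, "TRANS": 2, "WC": 3, "OR": 4}

def respects_phasing(tree, phasing):
    """Check a derivation tree against a phasing function.

    A tree is either the string "premise" or a pair (rule, [subtrees]).
    Assumption: a rule may only be applied to conclusions whose phase
    does not exceed the phase of the rule; premises have phase 0.
    """
    def phase(node):
        if node == "premise":
            return 0
        rule, children = node
        p = phasing[rule]
        for child in children:
            q = phase(child)
            if q is None or q > p:
                return None  # phase condition violated somewhere below
        return p
    return phase(tree) is not None

# TRANS followed by SA: SA (phase 1) is applied to a phase-2 conclusion.
trans_then_sa = ("SA", [("TRANS", ["premise", "premise"])])
# SA on both premises, then TRANS: phases are non-decreasing.
sa_then_trans = ("TRANS", [("SA", ["premise"]), ("SA", ["premise"])])

print(respects_phasing(trans_then_sa, PHASING_4LLG))
print(respects_phasing(sa_then_trans, PHASING_4LLG))
```

The first tree is rejected and the second accepted, which is the kind of reordering that the replacements of Proposition 11 perform.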

In Theorem 12 below we show that for each LLG derivation there is an equivalent
4LLG derivation. We first prove three propositions.

Proposition 9. Consider any potential derivation of PLLG, satisfying the condition
$R_p$ but not necessarily $R_F$. Then the following two conditions are equivalent:

1. The final derivation satisfies condition $R_F$,

2. The derivation satisfies $R_F$ everywhere.

Proof The labels are, in a suitable sense, cumulative. Every element of every
label in the derivation is classically implied by some element of the label of the
final conclusion.
Proposition 10. For each goal $G(\alpha \mid \beta)(F,p)$ derived in PLLG we have for each
$F_i \in F$ that $F_i \models \alpha \wedge \beta$.

Proof By induction on the structure of the proof tree. The property trivially holds
for the premises, and it is easily seen that the proof rules retain the property.



Proposition 11. We can replace two subsequent steps of an LLG derivation by
an equivalent 4LLG derivation.

Proof The replacements are given below. For each replacement the original LLG
derivation as well as its 4LLG replacement are given. From Proposition 10 it follows
that the replacement does not violate the consistency check. For example,
consider the reversing e.1 of OR$_4$ and TRANS$_2$. Call the fulfillments of the three
premises $G(\alpha \mid \beta_1 \wedge \gamma)$, $G(\alpha \mid \beta_2 \wedge \gamma)$ and $G(\beta_1 \vee \beta_2 \mid \gamma)$ respectively $F_1$, $F_2$ and $F_3$.
From the LLG derivation it follows that each element of $(F_1 \cup F_2) \times F_3$ is consistent,
and therefore all elements of $F_1 \times F_3$ and $F_2 \times F_3$ are consistent. Moreover, from
Proposition 10 it follows for each $F_{1,i} \in F_1$ that $F_{1,i} \models \beta_1$ and for each $F_{2,i} \in F_2$
that $F_{2,i} \models \beta_2$. Consequently, for the 4LLG derivation we have that the labels of
the replacements $F_1 \times \{\beta_1 \vee \neg\beta_2\} \times F_3$ and $F_2 \times \{\beta_2 \vee \neg\beta_1\} \times F_3$ are equivalent
to $F_1 \times F_3$ and $F_2 \times F_3$, and all elements of them are therefore consistent. The
other proofs are analogous.



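LLG derivation:

$$
\dfrac{\dfrac{G(\alpha \mid \beta \wedge \gamma_1) \quad G(\beta \mid \gamma_1)}{G(\alpha \wedge \beta \mid \gamma_1)}\,\text{TRANS}}{G(\alpha \wedge \beta \mid \gamma_1 \wedge \gamma_2)}\,\text{SA}
$$

4LLG replacement:

$$
\dfrac{\dfrac{G(\alpha \mid \beta \wedge \gamma_1)}{G(\alpha \mid \beta \wedge \gamma_1 \wedge \gamma_2)}\,\text{SA} \quad \dfrac{G(\beta \mid \gamma_1)}{G(\beta \mid \gamma_1 \wedge \gamma_2)}\,\text{SA}}{G(\alpha \wedge \beta \mid \gamma_1 \wedge \gamma_2)}\,\text{TRANS}
$$

a. Reversing the order of TRANS$_2$ and SA$_1$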

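LLG derivation:

$$
\dfrac{\dfrac{G(\alpha_1 \mid \beta_1)}{G(\alpha_1 \vee \alpha_2 \mid \beta_1)}\,\text{WC}}{G(\alpha_1 \vee \alpha_2 \mid \beta_1 \wedge \beta_2)}\,\text{SA}
$$

4LLG replacement:

$$
\dfrac{\dfrac{G(\alpha_1 \mid \beta_1)}{G(\alpha_1 \mid \beta_1 \wedge \beta_2)}\,\text{SA}}{G(\alpha_1 \vee \alpha_2 \mid \beta_1 \wedge \beta_2)}\,\text{WC}
$$

b. Reversing the order of WC$_3$ and SA$_1$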



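LLG derivation:

$$
\dfrac{\dfrac{G(\alpha_1 \mid \beta \wedge \gamma)}{G(\alpha_1 \vee \alpha_2 \mid \beta \wedge \gamma)}\,\text{WC} \quad G(\beta \mid \gamma)}{G((\alpha_1 \vee \alpha_2) \wedge \beta \mid \gamma)}\,\text{TRANS}
$$

4LLG replacement:

$$
\dfrac{\dfrac{G(\alpha_1 \mid \beta \wedge \gamma) \quad G(\beta \mid \gamma)}{G(\alpha_1 \wedge \beta \mid \gamma)}\,\text{TRANS}}{G((\alpha_1 \vee \alpha_2) \wedge \beta \mid \gamma)}\,\text{WC}
$$

c.1. Reversing the order of WC$_3$ and TRANS$_2$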



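LLG derivation:

$$
\dfrac{G(\alpha \mid (\beta_1 \vee \beta_2) \wedge \gamma) \quad \dfrac{G(\beta_1 \mid \gamma)}{G(\beta_1 \vee \beta_2 \mid \gamma)}\,\text{WC}}{G(\alpha \wedge (\beta_1 \vee \beta_2) \mid \gamma)}\,\text{TRANS}
$$

4LLG replacement:

$$
\dfrac{\dfrac{\dfrac{G(\alpha \mid (\beta_1 \vee \beta_2) \wedge \gamma)}{G(\alpha \mid \beta_1 \wedge \gamma)}\,\text{SA} \quad G(\beta_1 \mid \gamma)}{G(\alpha \wedge \beta_1 \mid \gamma)}\,\text{TRANS}}{G(\alpha \wedge (\beta_1 \vee \beta_2) \mid \gamma)}\,\text{WC}
$$

c.2. Reversing the order of WC$_3$ and TRANS$_2$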



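LLG derivation:

$$
\dfrac{\dfrac{G(\alpha \mid \beta_1) \quad G(\alpha \mid \beta_2)}{G(\alpha \mid \beta_1 \vee \beta_2)}\,\text{OR}}{G(\alpha \mid (\beta_1 \vee \beta_2) \wedge \beta_3)}\,\text{SA}
$$

4LLG replacement:

$$
\dfrac{\dfrac{G(\alpha \mid \beta_1)}{G(\alpha \mid \beta_1 \wedge \beta_3)}\,\text{SA} \quad \dfrac{G(\alpha \mid \beta_2)}{G(\alpha \mid \beta_2 \wedge \beta_3)}\,\text{SA}}{G(\alpha \mid (\beta_1 \wedge \beta_3) \vee (\beta_2 \wedge \beta_3))}\,\text{OR}
$$

d. Reversing the order of OR$_4$ and SA$_1$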



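LLG derivation:

$$
\dfrac{\dfrac{G(\alpha \mid \beta_1 \wedge \gamma) \quad G(\alpha \mid \beta_2 \wedge \gamma)}{G(\alpha \mid (\beta_1 \vee \beta_2) \wedge \gamma)}\,\text{OR} \quad G(\beta_1 \vee \beta_2 \mid \gamma)}{G(\alpha \wedge (\beta_1 \vee \beta_2) \mid \gamma)}\,\text{TRANS}
$$

4LLG replacement:

$$
\dfrac{\dfrac{G(\alpha \mid \beta_1 \wedge \gamma) \quad \dfrac{G(\beta_1 \vee \beta_2 \mid \gamma)}{G(\beta_1 \vee \beta_2 \mid \gamma \wedge (\beta_1 \vee \neg\beta_2))}\,\text{SA}}{G(\alpha \wedge (\beta_1 \vee \beta_2) \mid \gamma \wedge (\beta_1 \vee \neg\beta_2))}\,\text{TRANS} \quad \dfrac{G(\alpha \mid \beta_2 \wedge \gamma) \quad \dfrac{G(\beta_1 \vee \beta_2 \mid \gamma)}{G(\beta_1 \vee \beta_2 \mid \gamma \wedge (\beta_2 \vee \neg\beta_1))}\,\text{SA}}{G(\alpha \wedge (\beta_1 \vee \beta_2) \mid \gamma \wedge (\beta_2 \vee \neg\beta_1))}\,\text{TRANS}}{G(\alpha \wedge (\beta_1 \vee \beta_2) \mid \gamma)}\,\text{OR}
$$

e.1. Reversing the order of OR$_4$ and TRANS$_2$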

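LLG derivation:

$$
\dfrac{G(\alpha \mid \beta \wedge (\gamma_1 \vee \gamma_2)) \quad \dfrac{G(\beta \mid \gamma_1) \quad G(\beta \mid \gamma_2)}{G(\beta \mid \gamma_1 \vee \gamma_2)}\,\text{OR}}{G(\alpha \wedge \beta \mid \gamma_1 \vee \gamma_2)}\,\text{TRANS}
$$

4LLG replacement:

$$
\dfrac{\dfrac{\dfrac{G(\alpha \mid \beta \wedge (\gamma_1 \vee \gamma_2))}{G(\alpha \mid \beta \wedge \gamma_1)}\,\text{SA} \quad G(\beta \mid \gamma_1)}{G(\alpha \wedge \beta \mid \gamma_1)}\,\text{TRANS} \quad \dfrac{\dfrac{G(\alpha \mid \beta \wedge (\gamma_1 \vee \gamma_2))}{G(\alpha \mid \beta \wedge \gamma_2)}\,\text{SA} \quad G(\beta \mid \gamma_2)}{G(\alpha \wedge \beta \mid \gamma_2)}\,\text{TRANS}}{G(\alpha \wedge \beta \mid \gamma_1 \vee \gamma_2)}\,\text{OR}
$$

e.2. Reversing the order of OR$_4$ and TRANS$_2$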



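LLG derivation:

$$
\dfrac{\dfrac{G(\alpha_1 \mid \beta_1) \quad G(\alpha_1 \mid \beta_2)}{G(\alpha_1 \mid \beta_1 \vee \beta_2)}\,\text{OR}}{G(\alpha_1 \vee \alpha_2 \mid \beta_1 \vee \beta_2)}\,\text{WC}
$$

4LLG replacement:

$$
\dfrac{\dfrac{G(\alpha_1 \mid \beta_1)}{G(\alpha_1 \vee \alpha_2 \mid \beta_1)}\,\text{WC} \quad \dfrac{G(\alpha_1 \mid \beta_2)}{G(\alpha_1 \vee \alpha_2 \mid \beta_2)}\,\text{WC}}{G(\alpha_1 \vee \alpha_2 \mid \beta_1 \vee \beta_2)}\,\text{OR}
$$

f. Reversing the order of OR$_4$ and WC$_3$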



Theorem 12 (Equivalence LLG and 4LLG). Let $S$ be a set of conditional goals.
We have $S \vdash_{\mathrm{LLG}} G(\alpha \mid \beta)$ if and only if $S \vdash_{\mathrm{4LLG}} G(\alpha \mid \beta)$.

Proof Every 4LLG derivation is an LLG derivation. Conversely, we can take any LLG
derivation and construct an equivalent 4LLG derivation, by iteratively replacing
two subsequent steps in the wrong order by several steps in the right order, see
Proposition 11. If the proof tree is finite, then after a finite number of steps all
derivation steps are ordered, because no set of replacements cycles (a cycle could
otherwise be used to construct infinite proof trees).
Proposition 13. 4LLG is the only four-phase logic of goals for which Theorem 12
holds.

Proof Proposition 11 does not hold for any other four-phase logic. Counterexamples
for reversing the order of each two subsequent steps of 4LLG are given
below. For SA and TRANS, the premises $G(p \mid \top)$ and $G(q \mid \top)$ cannot be combined
unless one is first strengthened. For TRANS and WC, if the premise $G(q \mid \top)$ is
weakened then it can no longer be used to detach $G(p \mid q)$. For WC and OR, the
latter can only be applied if consequents are equivalent.

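$$
\dfrac{\dfrac{G(p \mid \top)}{G(p \mid q)}\,\text{SA} \quad G(q \mid \top)}{G(p \wedge q \mid \top)}\,\text{TRANS}
\qquad
\dfrac{\dfrac{G(p \mid q) \quad G(q \mid \top)}{G(p \wedge q \mid \top)}\,\text{TRANS}}{G(p \mid \top)}\,\text{WC}
\qquad
\dfrac{\dfrac{G(p_1 \mid q_1)}{G(p_1 \vee p_2 \mid q_1)}\,\text{WC} \quad G(p_1 \vee p_2 \mid q_2)}{G(p_1 \vee p_2 \mid q_1 \vee q_2)}\,\text{OR}
$$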



It is easy to see that the following proof rule OR' is implied by OR and WC 
in LLG. If we replace OR in LLG by the more general OR', then we can use it as 
a phase-3 rule. 

$$
\text{OR}':\ \dfrac{G(\alpha_1 \mid \beta_1)(F_1,p_1),\ G(\alpha_2 \mid \beta_2)(F_2,p_2)}{G(\alpha_1 \vee \alpha_2 \mid \beta_1 \vee \beta_2)(F_1 \cup F_2,\ p(\text{OR}))}
$$

Moreover, Theorem 14 shows that in 4LLG the consistency check on the label
can be replaced by a consistency check on the conjunction of the antecedent and
consequent of the goal.

Theorem 14. Consider any potential derivation of 4LLG, satisfying the condition
$R_p$ but not necessarily $R_F$. Then the following four conditions are equivalent:

1. The derivation satisfies condition $R_F$ throughout phase 1,

2. The derivation satisfies $R_F$ everywhere,

3. Each consequent is consistent with its antecedent throughout phase 1,

4. Each consequent is consistent with its antecedent everywhere.

Proof Clearly (2) $\Rightarrow$ (1) and (4) $\Rightarrow$ (3). Through phase 1, for each formula
the conjunction of antecedent and consequent is equivalent to the unique element
of its label. Hence (1) $\Leftrightarrow$ (3). In phase 2 the conjunction of antecedent and
consequent is also equivalent to the unique element of its label, which is equivalent
to the label of the first premise of each derivation step. In phases 3 and 4 the
rules preserve the consistency of the conjunction of antecedent and consequent,
and they also preserve the property that each element of the label is consistent.
From this we have (3) $\Rightarrow$ (4) and (1) $\Rightarrow$ (2). Putting this together gives us
(1) $\Leftrightarrow$ (2) $\Leftrightarrow$ (3) $\Leftrightarrow$ (4) and we are done.

In the following section we illustrate that the first two phases of the four-phase
logic 4LLG, i.e. SA and TRANS, can be combined in one phase without
invalidating Theorem 14. Moreover, the latter two phases of 4LLG, i.e. WC and
OR, can be combined similarly. The two-phase logic 2LLG first combines goals in
derivation chains or arguments (SA and TRANS), and then combines arguments
(WC and OR) with reasoning by cases.
5 Two-phase labeled logic of goals (2LLG)

The logic 2LLG is the PLLG that combines the first two phases and the last two
phases of 4LLG.

Definition 15 (2LLG). 2LLG is the PLLG with the phasing function $p$ defined
by $p(\text{SA}) = 1$, $p(\text{TRANS}) = 1$, $p(\text{WC}) = 2$, $p(\text{OR}) = 2$.

It is easy to see that 2LLG is equivalent to LLG and 4LLG in the sense of
Theorem 12, because its phasing is in between the phasing of LLG and 4LLG.
Moreover, Theorem 16 below is the analogue of Theorem 14 for the two-phase
logic 2LLG.

Theorem 16. Consider any potential derivation of 2LLG, satisfying the condition
$R_p$ but not necessarily $R_F$. Then the following four conditions are equivalent:

1. The derivation satisfies condition $R_F$ throughout phase 1,

2. The derivation satisfies $R_F$ everywhere,

3. Each consequent is consistent with its antecedent throughout phase 1,

4. Each consequent is consistent with its antecedent everywhere.

Proof Analogous to the proof of Theorem 14, because in phase 1 the conjunction
of antecedent and consequent is also equivalent to the unique element of its
label, and in phase 2 the rules preserve the consistency of the conjunction of
antecedent and consequent, as well as the property that each element of the label
is consistent.

We can construct other two-phase logics of goals, for example the one in
which the first phase consists of only SA, but the following proposition shows
that in those logics we cannot restrict ourselves to a consistency check on the
conjunction of the antecedent and consequent.

Proposition 17. The logic 2LLG is the only two-phase PLLG that validates
versions of Theorems 12 and 14 in which 4LLG is replaced by the two-phase logic.

Proof From the proof of Proposition 13 it follows that 2LLG is the only two-phase
PLLG for which Theorem 12 holds and in which WC cannot occur before TRANS.
The following derivation shows that if WC can occur before TRANS, then we need
the labels for the consistency check.



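$$
\dfrac{G(p \mid q \wedge r)(\{\{p \wedge q \wedge r\}\},0) \qquad \dfrac{G(\neg p \wedge q \mid r)(\{\{\neg p \wedge q \wedge r\}\},0)}{G(q \mid r)(\{\{\neg p \wedge q \wedge r\}\},p(\text{WC}))}\,\text{WC}}{G(p \wedge q \mid r)(\{\{p \wedge q \wedge r,\ \neg p \wedge q \wedge r\}\},p(\text{TRANS}))}\,\text{TRANS}
$$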
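The point of this derivation can be verified by brute force. The sketch below assumes classical propositional satisfiability as the notion of consistency; the function `satisfiable` and the encoding of formulas as Python functions are illustrative only.

```python
from itertools import product

ATOMS = ("p", "q", "r")

def satisfiable(formulas):
    """Return True if one truth assignment to ATOMS satisfies all formulas."""
    for values in product((False, True), repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        if all(f(v) for f in formulas):
            return True
    return False

# The two members of the single label element of the conclusion.
p_q_r = lambda v: v["p"] and v["q"] and v["r"]
not_p_q_r = lambda v: (not v["p"]) and v["q"] and v["r"]

# Consequent and antecedent of the conclusion G(p & q | r).
consequent = lambda v: v["p"] and v["q"]
antecedent = lambda v: v["r"]

# Check on the goal only: the consequent is consistent with the antecedent.
goal_check = satisfiable([consequent, antecedent])
# Check on the label: {p & q & r, ~p & q & r} has no model.
label_check = satisfiable([p_q_r, not_p_q_r])

print(goal_check, label_check)
```

The goal-level check succeeds while the label-level check fails, so without the label the conflicting origin of the conclusion would go unnoticed.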

The surprising theorems follow from the consistency check in PLLG, which is 
stronger than it seems at first sight. In the following section we illustrate that 
the consistency check restricts derivations to a type of constructive proofs in the 
sense that no conclusions may be drawn from conflicts. 
6 Conflicting goals

In this section we discuss three derivations related to conflicting goals. They are
not valid in PLLG, because their usefulness depends on the application area of the
logic of goals. In natural language different types of goals are used interchangeably,
with possible confusion. An agent can either commit himself to a goal,
because it maximizes his expected utility, or the goal may be imposed upon him
by some authority, e.g. his boss or owner. In the latter case, the agent can simply
adopt the goal as a desirable state, or he may interpret it as a reflection of the
authority's maximal expected utility. Most formalisms for goals are developed to
be used by intelligent robots, whose goals are the commands given by their owner.
In the robot case, the logical properties of goals resemble the logical properties
of obligations. As a consequence of the analogy between goals and obligations,
the following unrestricted conjunction rule may be accepted in the logic of goals.

$$
\text{AND}:\ \dfrac{G(\alpha_1 \mid \beta)(F_1,p_1),\ G(\alpha_2 \mid \beta)(F_2,p_2)}{G(\alpha_1 \wedge \alpha_2 \mid \beta)(F_1 \times F_2,\ \max(p_1,p_2))}
$$

Moreover, the deontic 'ought implies can' axiom $\neg G(\bot \mid \alpha)$ may be accepted.
For a discussion on the unrestricted conjunction rule and the associated prob- 
lems, as well as a development of logics satisfying this proof rule, we refer to the 
deontic logic literature (see e.g. [14,20] for a discussion and references). However, 
even if the unrestricted conjunction rule and the deontic ‘ought implies can’ ax- 
iom are not accepted in the logic of goals, then there is another interesting issue 
of conflicting goals. It concerns the following derivation, which we call Forbidden 
Conflict (FC) [17]. In this rule, as well as in the derivations in the remainder of
this section, we leave the labels unspecified. 

$$
\text{FC}:\ \dfrac{G(\neg a \mid \top) \quad G(a \mid b)}{G(\neg b \mid \top)}
$$

In contrast to the constructive proofs of PLLG, this proof rule formalizes a conflict 
averse strategy. If b is the case then a conflict arises; therefore the agent desires 
that b is false. The following continuation of Example 6 illustrates that deriva- 
tions no longer go as far as possible, but instead we have weak contraposition. 



Example 18. Assume the proof rules TRANS, WC and FC together with the four
goals $S = \{G(a \mid b),\ G(b \mid c),\ G(c \mid \top),\ G(\neg a \mid \top)\}$. The goal $G(b \mid \top)$ can be derived
from $G(b \mid c)$ and $G(c \mid \top)$ by TRANS and WC, and the goal $G(\neg b \mid \top)$ from $G(a \mid b)$
and $G(\neg a \mid \top)$ by FC. Hence, there is a conflict for $b$, because $G(b \mid \top)$ as well as
$G(\neg b \mid \top)$ can be derived. At first sight, this seems undesirable (see also [9]).

The following derivation is the third and last one we discuss related to con- 
flicting goals. 

$$
\dfrac{G(a \mid b) \quad G(a \mid \neg b) \quad G(a \leftrightarrow b \mid \top)}{G(a \wedge b \mid \top)}
$$

It formalizes a more complex conflict averse strategy. From the two goals $G(a \mid \neg b)$
and $G(a \leftrightarrow b \mid \top)$ it follows that if $\neg b$ is the case then a conflict arises: $a$ and
$a \leftrightarrow b$ cannot both be the case. The goal $G(a \wedge b \mid \top)$ can thus be derived.



It follows from Proposition 10 that in PLLG this derivation is blocked, because
all consistent extensions of the fulfillment of the premise $G(a \mid \neg b)$ do not derive
$a \wedge b$. The following derivation illustrates that the goal can be derived in 4LLG
without fulfillment check with the unrestricted conjunction rule AND.



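$$
\dfrac{\dfrac{G(a \mid b) \quad G(a \mid \neg b)}{G(a \mid \top)}\,\text{OR} \quad G(a \leftrightarrow b \mid \top)}{G(a \wedge b \mid \top)}\,\text{AND}
$$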



The consequences of conflict averse strategies for PLLG are yet unclear, and 
left for further research. 



7 Related research 

Several authors have observed the relation between goals and desires in quali- 
tative decision theory and obligations in deontic logic [10,1,8] and we have dis- 
cussed the relation between decision theory, diagnosis theory and deontic logic 
in [19]. Labeled deontic logic was introduced in [16,17], though its properties 
have not been studied. It does not make a dilemma like $O(p \mid \top) \wedge O(\neg p \mid \top)$
inconsistent, in contrast to traditional deontic logics like for example so-called 
standard deontic logic (SDL), and we therefore now interpret it as a logic of 
goals. Makinson [9] extended labeled deontic logic with the disjunction rule to 
cover reasoning by cases and an unrestricted conjunction rule. Two-phase deon- 
tic logic has been proposed in [13], but phased deontic logic has not been related 
to labeled deontic logic (although it has been suggested in [9]). 

In [15] we introduced the one-phase labeled logic of goals LLG$_0$ based on
labeled formulas $G(\alpha \mid \beta)(F,V)$ and two consistency checks. In LLG$_0$, a formula
$G(\alpha \mid \beta)(\{\alpha \wedge \beta\},\{\neg\alpha \wedge \beta\})$ is called a premise, and the label of a goal derived by an
inference rule is the union of the labels of the premises used in this inference
rule. LLG$_0$ based on a violation check and a fulfillment check consists of inference
rules, extended with a condition $R = R_V + R_F$.

$R_V$: $G(\alpha \mid \beta)(F,V)$ may only be derived if $\alpha \wedge \beta \not\models \gamma$ for all $\gamma \in V$: fulfilling a
derived goal should not imply a violation of one of the goals it is derived
from;

$R_F$: $G(\alpha \mid \beta)(F,V)$ may only be derived if $\alpha \wedge \beta \not\models \neg\gamma$ for all $\gamma \in F$: it must always
be possible to fulfill a derived goal and each of the goals it is derived from,
though not necessarily all of them at the same time.
This logic is stronger than the logics proposed in this paper, because it validates
the following counterintuitive derivation. Notice that we cannot strengthen the 
condition of this logic, such that for example it is always possible that all the 
premises can be fulfilled at the same time, because then the intuitive first deriva- 
tion of Example 7 is blocked. 

G{p\r) G{q\r) 

Gipf\q\r) GiphqGr) 

OR 

G{p^q\-Y) 

wc 

G((pAg)Vs|T) 

SA. 

G{{p A g) V s|-'(p A g A r)) 



Alternatively, and perhaps more intuitively, we take the logic PLLG and replace
its SA rule by the following rule SA*, and replace $R_F$ by $R_F^*$ with a
consistency check on the conjunction of the fulfillments in the label and the
antecedent of the goal, as in LLG$_0$. We call the resulting logics PLLG*.

$$
\text{SA}^*:\ \dfrac{G(\alpha \mid \beta_1)(F,p),\ R}{G(\alpha \mid \beta_1 \wedge \beta_2)(F,\ p(\text{SA}))}
$$

$R_F^*$: $G(\alpha \mid \beta)(F,p)$ may only be derived if $\{\beta\} \cup F_i$ is consistent for each $F_i \in F$.

It is easy to see that 4LLG* is equivalent to 4LLG. Moreover, we conjecture
that Theorem 12 still holds, i.e. that LLG* is equivalent to 4LLG*, and we conjecture
that PLLG* is equivalent to PLLG. However, the following example illustrates
that the proof of Theorem 12 no longer holds, because some LLG* derivations
have intermediate results (here $G(p \mid q_1)$) which cannot be used in 4LLG*.

$$
\dfrac{\dfrac{G(p \mid q_1) \quad G(p \mid q_2)}{G(p \mid q_1 \vee q_2)}\,\text{OR}}{G(p \mid \neg q_1 \wedge q_2)}\,\text{SA}
$$



Makinson [9] proposes a one-phase labeled logic based on $R_F^*$ in which the
premises are represented by $G(\alpha \mid \beta)(\{\{\alpha\}\})$, i.e. in which only the consequents are
represented in the label. Obviously, Theorem 12 no longer holds. Typical properties
are that the disjunction rule always holds, such that the second derivation
in Example 7 is not blocked, and the last derivation of Section 6 is also valid.
Moreover, the logic has been extended with an unrestricted conjunction rule.

Most other logics of goals that have been proposed have a built in mechanism 
such that more specific and conflicting goals override more general ones, see 
e.g. [1,12,11]. However, specificity is only one possible rule to decide conflicts, 
which may be overridden itself (as in legal reasoning, where more general later 
rules override more specific earlier rules). Moreover, these logics make conflicts
like $G(p) \wedge G(\neg p)$ inconsistent, which is not in line with the idea that goals
can refer to different objectives. Finally, the non-monotonic mechanisms do not 
formalize the deliberating robber satisfactorily [18]. 
8 Conclusions and further research

Labeled logic is a powerful formalism to analyze and construct proofs. In this 
paper we discussed four proof rules in the framework of labeled deduction and 
we showed how labeled logics can combine the proof rules without deriving 
counterintuitive consequences. In particular, we showed how derivations can be 
partitioned into several phases. Surprisingly, the phasing of 4LLG does not restrict
the set of derivable conclusions of LLG. Consequently, the phasing of 4LLG
can be considered as a useful heuristic to make the proof theory of LLG more
efficient, because only a small subset of all proofs of LLG has to be considered
to prove the (in)validity of a formula.

Presently we are looking for ways to define a semantics for PLLG, such that
also negations and disjunctions of goals are defined. One approach has been
given in [9]. A possible worlds semantics can be defined along the lines of the
two-phase deontic logic in [13,21]. Theorems 14 and 16 show that in 4LLG and
2LLG we can get rid of the fulfillments in the label by checking the consistency
of the conjunction of the antecedent and consequent of the dyadic operators.
Moreover, we can also get rid of the integer in the label by introducing different
operators for each phase. For example, for 2LLG we can define the two operators
$G_1$ (with proof rules SA and TRANS) and $G_2$ (with proof rules WC and OR). The
premises are goals $G_1(\alpha_i \mid \beta_i)$, the conclusion is a goal $G_2(\alpha \mid \beta)$, and the two
phases are linked to each other with the following new proof rule.

$$
\dfrac{G_1(\alpha \mid \beta)}{G_2(\alpha \mid \beta)}
$$

To fully grasp the logical properties of goals, we think it is important to 
not only consider the proof theory, but also to consider the semantic relation 
between goals, desires, utilities and preferences in a decision-theoretic setting 
(see e.g. [10,8,22]). In particular, goals serve as computationally useful heuristic 
approximations of the relative preferences over the possible results of a plan [3] , 
and goals are used to communicate desires in a compact and efficient way [6]. 
We think this is not a rivaling approach to the approach taken in this paper, 
but a complementary one. 
Abstract. We present a model of parallel search in theorem proving for
forward-reasoning strategies, with contraction and distributed search. 
We extend to parallel search the bounded-search-spaces approach to the 
measurement of infinite search spaces, capturing both the advantages of 
parallelization, e.g., the subdivision of work, and its disadvantages, e.g., 
the cost of communication, in terms of search space. These tools are 
applied to compare the search space of a distributed-search contraction- 
based strategy with that of the corresponding sequential strategy. 



1 Introduction 

The difficulty of fully-automated theorem proving has led researchers to investigate ways
of enhancing theorem-proving strategies with parallelism. We distinguish among 
parallelism at the term level (i.e., parallelizing the inner algorithms of the strat- 
egy), parallelism at the clause level (i.e., parallel inferences within a single search) 
and parallelism at the search level or parallel search [8]. This paper considers par- 
allel search: deductive processes search in parallel the space of the problem, and 
the parallel search succeeds as soon as one of the processes succeeds. A parallel-search
strategy may subdivide the search space among the processes (distributed
search), or assign to the processes different search plans (multi-search), or combine
both principles. The processes communicate to merge their results, and
preserve completeness (e.g., if the search space is subdivided or unfair search 
plans are used). This paper concentrates on distributed search. 

Parallel search applies to theorem-proving strategies in general. This paper 
studies forward-reasoning and in particular contraction-based strategies (e.g., 
[13,19,9,4]). There has been much interest in parallelizing these strategies (e.g., 
[12,11,5,15,6] and [8,21] for earlier references), because they behave well sequen- 
tially (e.g., Otter [17], RRL [14], Reveal [2], EQP [18] are based on these strate- 
gies, and thanks to them succeeded in solving challenge problems, e.g., [2,1,18]). 
However, the parallelization of contraction-based strategies is difficult, primar- 
ily because of backward contraction: the clauses used as premises are subject to 
being deleted, the database is highly dynamic, and eager backward contraction 
(inter-reduce before expanding) diminishes the degree of concurrency of the inferences.
The qualitative analysis in [8] showed how these factors adversely affect
approaches based on parallelism at the term or clause level.

This paper begins a quantitative analysis of distributed-search contraction-based
strategies, which poses several problems. We need to represent subdivision
of the search space and communication, and measure their advantage and cost, 
respectively. Since the subdivision may not be effective, the search processes 
overlap: one needs to capture also the disadvantage of duplicated search. These 
problems are made worse by a fundamental difficulty: the search spaces in theo- 
rem proving are infinite. Therefore, we cannot analyze subdivision and overlap 
in terms of total size of the search space. Neither can we rely on the classi- 
cal complexity measure of time to capture the cost of communication, because 
theorem-proving strategies are semidecision procedures that may not halt, so 
that “time” is not defined. Finally, the problems are not well-defined. Since 
parallel theorem proving is a young field, and an analysis of this kind was not 
attempted before, there are no standard formal definitions for many of the con- 
cepts involved. In recent work [10], we proposed an approach to the analysis 
of strategies, comprising a model for the representation of search, a notion of 
complexity of search in infinite spaces, and measures of this complexity, termed 
bounded search spaces. In this paper we build on this previous work to address 
the problems listed above. 

The first part of the paper (Sections 2 and 3) develops a framework of
definitions for parallel theorem proving by forward-reasoning distributed-search
strategies. Three important properties of parallel search plans (monotonicity
of the subdivision, fairness and eager contraction) are identified, and
sufficient conditions to satisfy them are given. We point out that it is not obvious
that a parallelization of a contraction-based strategy is contraction-based.
On the contrary, this issue is critical in the parallelization of forward reasoning. 
A model of parallel search is presented in Section 4. A strategy with con- 
traction not only visits, but also modifies the search space [10]. In distributed 
search also subdivision and communication modify the search space, and many 
processes are active in parallel. Our solution is based on distinguishing the search 
space and the search process, and yet representing them together in a parallel 
marked search graph. The structure of the graph represents the search space 
of all the possible inferences, while the marking represents the search process, 
including contraction, subdivision and communication. 

Once we have a model of the search, we turn to measuring benefits and 
costs of parallelization in terms of search complexity (Section 5). The 
methodology of [10] is based on the observation that for infinite search spaces it 
is not sufficient to measure the generated search space. It is necessary to mea- 
sure also the effects of the actions of the strategy on the infinite space that lies 
ahead. An exemplary case is that of contraction, where the deletion of an existing 
clause may prevent the generation of others. Our approach is to enrich the search 
space with a notion of distance, and consider the bounded search space made of 
the clauses whose distance from the input is within a given bound. The infinite 
search space is reduced to an infinite succession of finite bounded search spaces.
Since the bounded search spaces are finite, they can be compared (in a multi- 
set ordering), eliminating the obstacle of the impossibility of comparing infinite 
spaces. The second fundamental property of the bounded search spaces is that 
they change with the steps performed by the strategy. In the sequential case, 
the only factor which modifies the bounded search spaces is contraction, which 
makes clauses unreachable (infinite distance). In the parallel case, there are also
subdivision and communication: subdivision makes the bounded search spaces 
for the parallel processes smaller (advantage of parallelism), while communica- 
tion undoes in part the effect of the subdivision (disadvantage of parallelism). 
The parallel bounded search spaces for the parallel derivation as a whole measure 
also the cost of duplicated search due to overlapping processes. 

Section 6 applies these tools to the analysis of distributed-search con- 
traction-based strategies. In distributed search, eager contraction depends 
on communication (e.g., to bring to a process a needed simplifier). We discover 
two related patterns of behavior, called late contraction and contraction undone, 
where eager contraction fails. It follows that search paths that eager contraction 
would prune are not pruned. While in a sequential derivation the bounded search 
spaces decrease monotonically due to contraction, in a parallel derivation they 
may oscillate non-monotonically, because they reflect the conflict of subdivision 
and communication, and the conflict of contraction and communication. How- 
ever, the incidence of late contraction and contraction undone decreases as the 
speed of communication increases, and at the limit, if communication takes no 
time, they disappear. For the overlap, we give sufficient conditions to minimize 
it relying on local eager contraction, independent of communication. 

The last task is to compare a sequential contraction-based strategy $\mathcal{C}$ with
its parallelization $\mathcal{C}'$. We prove that if $\mathcal{C}'$ has instantaneous communication and
minimizes the overlap, its parallel bounded search spaces are smaller than those
of $\mathcal{C}$. On one hand, this result represents a limit that concrete strategies may
approximate. For instance, this theorem justifies formally the intuition about 
improving performance by devising subdivision criteria that reduce the over- 
lap (e.g., [6]). On the other hand, since the hypothesis of instantaneous com- 
munication is needed, it represents a negative result on the parallelizability of 
contraction-based strategies, which contributes to explain the difficulty with ob- 
taining generalized performance improvements by parallel theorem proving. 

This kind of analysis is largely new, especially for parallel strategies. Most 
studies of complexity in deduction analyze the length of propositional proofs as 
part of the NP $\neq$ co-NP quest (e.g., [22]), or work with Herbrand complexity
and proof length to obtain lower bounds for sets of clauses (e.g., [16]). The study 
in [20] analyzes measures of duplication in the search spaces of theorem-proving 
strategies. The full version of this paper, with the proofs and more references, 
can be found in [7]. 
2 Parallel theorem-proving strategies

In the inference system of a theorem-proving strategy, expansion rules (e.g., reso- 
lution) generate clauses, while contraction rules (e.g., subsumption and simplifi- 
cation) delete or reduce clauses according to a well-founded ordering (e.g., the 
multiset extension of a complete simplification ordering on atoms). An inference
rule can be seen as a function which takes a tuple of premises and returns a set 
of clauses to be added and a set of clauses to be deleted: 

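Definition 1. Let $\Theta$ be a signature, $\mathcal{L}_\Theta$ the language of clauses on $\Theta$, and
$\mathcal{P}(\mathcal{L}_\Theta)$ its powerset. An inference rule $f^n$ of arity $n$ is a function
$f^n: \mathcal{L}_\Theta^n \rightarrow \mathcal{P}(\mathcal{L}_\Theta) \times \mathcal{P}(\mathcal{L}_\Theta)$. (If $f^n$ does not apply to $\bar{x}$, $f^n(\bar{x}) = (\emptyset,\emptyset)$.)

Given $\bar{x} = (\varphi_1 \ldots \varphi_n)$, let $X$ be the multiset $\{\varphi_1 \ldots \varphi_n\}$, and $\pi_1(x,y) = x$ and
$\pi_2(x,y) = y$ the projection functions:

Definition 2. Given a well-founded ordering $(\mathcal{L}_\Theta, \succ)$, $f^n$ is an expansion rule if
$\forall \bar{x} \in \mathcal{L}_\Theta^n$, $\pi_2(f^n(\bar{x})) = \emptyset$. It is a contraction rule w.r.t. $\succ$ if either $\pi_1(f^n(\bar{x})) =
\pi_2(f^n(\bar{x})) = \emptyset$, or $\pi_2(f^n(\bar{x})) \neq \emptyset$ and $X - \pi_2(f^n(\bar{x})) \cup \pi_1(f^n(\bar{x})) \prec_{mul} \pi_2(f^n(\bar{x}))$.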

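The closure of a set of clauses $S$ with respect to an inference system $I$ is the
set $S_I^* = \bigcup_{k \geq 0} I^k(S)$, where $I^0(S) = S$, $I^k(S) = I(I^{k-1}(S))$ for $k \geq 1$ and
$I(S) = S \cup \{\varphi \mid \varphi \in \pi_1(f(\varphi_1 \ldots \varphi_n)),\ f \in I,\ \varphi_1 \ldots \varphi_n \in S\}$.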

Clauses deleted by contraction are redundant, and inferences that use redundant
clauses (without deleting them) are also redundant (e.g., [9,4]). Using the
notion of redundancy criterion [3], $R(S)$ denotes the set of clauses redundant
with respect to $S$ according to $R$. A redundancy criterion $R$ and a set of contraction
rules $I_R$ correspond if whatever is deleted by $I_R$ is redundant according
to $R$ ($\pi_2(f^n(\bar{x})) \subseteq R(X - \pi_2(f^n(\bar{x})) \cup \pi_1(f^n(\bar{x})))$), and if $\varphi \in R(S) \cap S$, $I_R$ can
delete $\varphi$ without adding other clauses to make it redundant ($\pi_1(f^n(\bar{x})) = \emptyset$ and
$\pi_2(f^n(\bar{x})) = \{\varphi\}$). $I_R$ and $R$ are based on the same ordering. Let $I = I_E \cup I_R$,
distinguishing expansion and contraction rules in $I$.

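Next, a parallel strategy has a system $M$ of communication operators, such
as receive: $\mathcal{L}_\Theta^* \rightarrow \mathcal{P}(\mathcal{L}_\Theta) \times \mathcal{P}(\mathcal{L}_\Theta)$, and send: $\mathcal{L}_\Theta^* \rightarrow \mathcal{P}(\mathcal{L}_\Theta) \times \mathcal{P}(\mathcal{L}_\Theta)$, where
receive$(\bar{x}) = (\bar{x}, \emptyset)$ (adds received clauses to the database of the receiver), and
send$(\bar{x}) = (\emptyset, \emptyset)$ (sending something does not modify the database of the sender).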
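The view of inference rules and communication operators as functions returning a pair (clauses to add, clauses to delete) can be sketched on a toy propositional example. Everything below is an illustrative assumption: clauses are frozensets of integer literals (a negative integer is a negated atom), states are plain sets rather than multisets, and the names `resolve`, `subsume` and `apply_step` are hypothetical.

```python
def resolve(c1, c2):
    """Expansion rule (binary propositional resolution): adds resolvents, deletes nothing."""
    added = set()
    for lit in c1:
        if -lit in c2:
            added.add(frozenset((c1 - {lit}) | (c2 - {-lit})))
    return added, set()

def subsume(c1, c2):
    """Contraction rule: if c1 properly subsumes c2, delete c2 and add nothing."""
    if c1 < c2:
        return set(), {c2}
    return set(), set()

def receive(*clauses):
    """Communication operator: received clauses are added to the local database."""
    return set(clauses), set()

def send(*clauses):
    """Communication operator: sending leaves the local database unchanged."""
    return set(), set()

def apply_step(state, rule, premises):
    """New state = old state plus added clauses minus deleted clauses."""
    added, deleted = rule(*premises)
    return (state | added) - deleted

c1 = frozenset({1, 2})    # P or Q
c2 = frozenset({-1, 2})   # not P or Q
unit = frozenset({2})     # Q

state = {c1, c2}
state = apply_step(state, resolve, (c1, c2))   # generates Q
state = apply_step(state, subsume, (unit, c1)) # Q subsumes P or Q
state = apply_step(state, subsume, (unit, c2)) # Q subsumes not P or Q
print(state)
```

After the expansion step and the two contraction steps only the unit clause remains, which shows how contraction shrinks the database that later steps draw premises from.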

The other component of a strategy is the search plan, which chooses inference
rule and premises at each stage of a derivation $S_0 \vdash \ldots S_i \vdash \ldots$, where $S_i$ is the
state of the derivation after $i$ steps, usually the multiset of existing clauses. We
use States for the set of states and States* for sequences of states. In distributed
search, the search plan also controls communication and subdivision. Since $S_I^*$ is
infinite and unknown, the subdivision is built dynamically: at stage $i$ the search
plan subdivides the inferences that can be done in $S_i$. For each process $p_k$, an
inference is either allowed (assigned to $p_k$), or forbidden (assigned to others):
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Definition 3. A parallel search plan is a 4-tuple E = {(, a, uj): 

1. The rule-selecting function (: States* xNxN^JUM takes as arguments 
the partial history of the derivation, the number of processes and the iden- 
tifier of the process executing the selection, and returns an inference rule or 
a communication operator. 

2. The premise-selecting function f: States* x IN x IN x (J U M) — > also 

takes in input the selection of and satisfies ^{{Sq . . . Si), n, k, /*”) G S'™. 

3. The subdivision function a: States* x IN x IN x (J U M) x Cq Bool also 
takes as argument the selection of and returns true (process allowed to 
perform the step), or false (forbidden), or T (undefined). 

4. The termination- detecting function co: States Bool returns true if and 
only if the given state contains the empty clause. 

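Definition 4. Given a theorem-proving problem $S$, the parallel derivation
generated by a strategy $\mathcal{C} = \langle I, M, \Sigma \rangle$, with $\Sigma = \langle \zeta, \xi, \alpha, \omega \rangle$, for processes $p_0 \ldots p_{n-1}$
is made of $n$ asynchronous local derivations $S = S_0^k \vdash_{\mathcal{C}} \ldots S_i^k \vdash_{\mathcal{C}} \ldots$, s.t. $\forall k$,
$0 \leq k \leq n-1$, $\forall i \geq 0$, if $\omega(S_i^k) = false$, $\zeta((S_0^k \ldots S_i^k), n, k) = f$, either
$f = receive$ and $\bar{x}$ is received, or $f \neq receive$ and $\xi((S_0^k \ldots S_i^k), n, k, f) = \bar{x}$,
and $\alpha((S_0^k \ldots S_i^k), n, k, f, \bar{x}) = true$, then $S_{i+1}^k = S_i^k \cup \pi_1(f(\bar{x})) - \pi_2(f(\bar{x}))$.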
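To make the roles of the four components concrete, one local step of Definition 4 can be sketched as follows. This is an illustration only, not the paper's formalization: clauses are plain strings, the class `ParallelSearchPlan` and the function `local_step` are hypothetical names, the undefined value of the subdivision function is modeled by `None`, and the receive case is omitted. Leaving the state unchanged when the step is not allowed is an assumption of the sketch; the definition only specifies the allowed case.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional, Sequence, Tuple

Clause = str
State = FrozenSet[Clause]
Premises = Tuple[Clause, ...]
# A rule maps a tuple of premises to (generated clauses, deleted clauses).
Rule = Callable[[Premises], Tuple[FrozenSet[Clause], FrozenSet[Clause]]]


@dataclass
class ParallelSearchPlan:
    select_rule: Callable[[Sequence[State], int, int], Rule]                 # zeta
    select_premises: Callable[[Sequence[State], int, int, Rule], Premises]   # xi
    allowed: Callable[[Sequence[State], int, int, Rule, Premises], Optional[bool]]  # alpha
    done: Callable[[State], bool]                                            # omega


def local_step(plan: ParallelSearchPlan, history: Sequence[State], n: int, k: int) -> State:
    """One step of the local derivation of process k out of n processes."""
    state = history[-1]
    if plan.done(state):
        return state
    f = plan.select_rule(history, n, k)
    x = plan.select_premises(history, n, k, f)
    # Assumption of this sketch: a forbidden or undefined step leaves the state as is.
    if plan.allowed(history, n, k, f, x) is not True:
        return state
    generated, deleted = f(x)
    return (state | generated) - deleted  # S_{i+1} = S_i + pi_1(f(x)) - pi_2(f(x))
```

A subdivision that allows the step to process 0 only would let `local_step(plan, history, 2, 0)` change the state while `local_step(plan, history, 2, 1)` returns it unchanged.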

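A sequential search plan $\Sigma = \langle \zeta, \xi, \omega \rangle$ has $\zeta$: $States^* \to I$, $\xi$: $States^* \times I \to \mathcal{L}_\Theta^*$,
and $\omega$. A parallel search plan $\Sigma' = \langle \zeta', \xi', \alpha, \omega \rangle$ is a parallelization
by subdivision of $\Sigma$ if: whenever $\zeta'((S_0 \ldots S_i), n, k) \in I$, $\zeta'((S_0 \ldots S_i), n, k) =
\zeta(S_0 \ldots S_i)$; and $\xi'((S_0 \ldots S_i), n, k, f) = \xi((S_0 \ldots S_i), f)$. $\mathcal{C}' = \langle I, M, \Sigma' \rangle$ is a
parallelization of $\mathcal{C} = \langle I, \Sigma \rangle$ if $\Sigma'$ is a parallelization of $\Sigma$.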

3 Monotonicity, fairness and eager contraction 

If $\zeta$ and $\xi$ can select a certain $f$ and $\bar{x}$, $\alpha$ needs to be defined on their selection,
and it should not “forget” its decisions when clauses are deleted:

Definition 5. A subdivision function $\alpha$ is total on generated clauses if for all
$S_0 \vdash \ldots S_i \vdash \ldots$, $k$, $n$, $f^m$ and $\bar{x} \in (\bigcup_{j=0}^{i} S_j)^m$, $\alpha((S_0 \ldots S_i), n, k, f^m, \bar{x}) \neq \bot$.

Since it is undesirable that permission changes after it has been decided, $\alpha$
is required to be monotonic w.r.t. $\bot < false$ and $\bot < true$:

Definition 6. A subdivision function $\alpha$ is monotonic if for all $S_0 \vdash \ldots S_i \vdash \ldots$,
$n$, $k$, $f$, $\bar{x}$, and $i \geq 0$, $\alpha((S_0 \ldots S_i), n, k, f, \bar{x}) \leq \alpha((S_0 \ldots S_{i+1}), n, k, f, \bar{x})$.
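The ordering behind Definition 6 has only three values, so monotonicity along one derivation is easy to check mechanically. The sketch below is a hedged illustration: `None` stands for the undefined value, and the helper names `leq` and `is_monotonic` are hypothetical. It takes the successive decisions of the subdivision function on one fixed rule and tuple of premises.

```python
from typing import Optional, Sequence


def leq(a: Optional[bool], b: Optional[bool]) -> bool:
    """a <= b in the order where None (undefined) lies below both True and False,
    while True and False are incomparable with each other."""
    return a is None or a == b


def is_monotonic(decisions: Sequence[Optional[bool]]) -> bool:
    """True if a decision, once taken, is never changed or forgotten."""
    return all(leq(a, b) for a, b in zip(decisions, decisions[1:]))
```

For instance `[None, None, True, True]` is monotonic, whereas `[None, True, False]` (permission revoked) and `[True, None]` (decision forgotten) are not.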

A strategy is complete if it succeeds whenever $S_0$ is inconsistent. Completeness
is made of refutational completeness of the inference system, and fairness
of the search plan [9]. A sufficient condition for fairness is (e.g., [3]):

Definition 7. A derivation $S_0 \vdash \ldots S_i \vdash \ldots$ is uniformly fair w.r.t. $I$ and $R$
if $I(S_\infty - R(S_\infty)) \subseteq \bigcup_{i \geq 0} S_i$, where $S_\infty = \bigcup_{i \geq 0} \bigcap_{j \geq i} S_j$ (persistent clauses).

In distributed search a process only needs to be fair with respect to what is
allowed. Since $\alpha$ is monotonic, allowed ($\exists j\ \alpha((S_0 \ldots S_j), n, k, f, \bar{x}) = true$) is
equivalent to persistently allowed ($\exists i\ \forall j \geq i\ \alpha((S_0 \ldots S_j), n, k, f, \bar{x}) = true$).

Definition 8. A local derivation $S_0^k \vdash \ldots S_i^k \vdash \ldots$ is uniformly fair w.r.t.
$I$, $R$ and $\alpha$, if $\varphi \in \pi_1(f^m(\bar{x}))$, $\bar{x} \in (S_\infty^k - R(S_\infty^k))^m$ and $\exists i \geq 0\ \forall j \geq i$
$\alpha((S_0^k \ldots S_j^k), n, k, f^m, \bar{x}) = true$ imply $\varphi \in \bigcup_{j \geq 0} S_j^k$.

The search plan needs to ensure that for every inference from a tuple of
persistent (in $S_\infty = \bigcup_{k=0}^{n-1} S_\infty^k$) non-redundant premises, there is a $p_k$ which is allowed
(fairness of subdivision) and has all the premises (fairness of communication):

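Definition 9. A parallel derivation $S_0^k \vdash_{\mathcal{C}} \ldots S_i^k \vdash_{\mathcal{C}} \ldots$, for $k \in [0, n-1]$, by
$\Sigma = \langle \zeta, \xi, \alpha, \omega \rangle$, is uniformly fair w.r.t. $I$ and $R$, if:

1. $\forall p_k$, $S_0^k \vdash_{\mathcal{C}} \ldots S_i^k \vdash_{\mathcal{C}} \ldots$ is uniformly fair w.r.t. $I$, $R$ and $\alpha$ (local fairness).

2. $\forall f^m \in I$, $\forall \bar{x} \in (S_\infty - R(S_\infty))^m$ s.t. $\pi_1(f^m(\bar{x})) \neq \emptyset$, $\exists p_k$ s.t. $\bar{x} \in
(S_\infty^k - R(S_\infty))^m$ (fairness of communication), and $\exists i \geq 0$, s.t. $\forall j \geq i$,
$\alpha((S_0^k \ldots S_j^k), n, k, f^m, \bar{x}) = true$ (fairness of subdivision).

If $\Sigma$ is uniformly fair, its parallelization $\Sigma'$ inherits local fairness from $\Sigma$.

Theorem 10. If a parallel derivation $S_0^k \vdash_{\mathcal{C}} \ldots S_i^k \vdash_{\mathcal{C}} \ldots$, for $k \in [0, n-1]$, is
uniformly fair with respect to $I$ and $R$, then $I(S_\infty - R(S_\infty)) \subseteq \bigcup_{k=0}^{n-1} \bigcup_{j \geq 0} S_j^k$.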

A sequential strategy is contraction-based if it features contraction rules and 
an eager-contraction search plan: 

Definition 11. A derivation $S_0 \vdash_I \ldots S_i \vdash_I \ldots$ has eager contraction, if for all
$i \geq 0$ and $\varphi \in S_i$, if there are $f^m \in I_R$ and $\bar{x} \in S_i^m$, such that $\pi_2(f^m(\bar{x})) = \{\varphi\}$,
then $\exists l \geq i$ such that $S_l \vdash S_{l+1}$ deletes $\varphi$, and $\forall j$, $i \leq j < l$, $S_j \vdash S_{j+1}$ is not an
expansion inference, unless the derivation succeeds sooner.

The parallelization of eager contraction is difficult. First, each local derivation
needs to have local eager contraction, which is defined as in Def. 11 with
“expansion” replaced by “expansion or communication”. Second, the strategy should
ensure that $p_k$ eagerly deletes a clause reducible by a premise generated by $p_h$.
This depends on communication:

Definition 12. Let $\varphi \in S_\infty - R(S_\infty)$ and $i$ be the first stage s.t. $\varphi \in \bigcup_{k=0}^{n-1} S_i^k$. A
parallel derivation has propagation of clauses up to redundancy if $\forall p_h\ \exists j\ \varphi \in S_j^h$,
instantaneous propagation of clauses up to redundancy if $\forall p_h\ \varphi \in S_i^h$.

Definition 13. A parallel derivation has distributed global contraction, if for all
$p_k$, $i \geq 0$, and $\varphi \in S_i^k$, if there are $f^m \in I_R$ and $\bar{x} \in (\bigcup_{h=0}^{n-1} S_i^h)^m$ such that
$\pi_2(f^m(\bar{x})) = \{\varphi\}$, then $\exists l \geq i$ such that $S_l^k \vdash S_{l+1}^k$ deletes $\varphi$, unless $p_k$ halts
sooner. It has global eager contraction if, in addition, $\forall j$, $i \leq j < l$, $S_j^k \vdash S_{j+1}^k$ is
neither an expansion nor a communication step.

Global eager contraction generalizes eager contraction to parallel derivations,
while distributed global contraction is a weaker requirement which guarantees
contraction, but not priority over expansion. In the presence of communication
delays, $p_k$ may use as expansion premise a clause which is reducible with respect
to $\bigcup_{h=0}^{n-1} S_i^h$ before it becomes reducible with respect to $S_i^k$. Thus, we have:

Lemma 14. Local eager contraction and propagation of clauses up to redundancy
(instantaneous propagation of clauses) imply distributed global contraction
(global eager contraction, respectively).

A parallel strategy is contraction-based if it has contraction rules and its
search plan has local eager contraction and distributed global contraction.

Theorem 15. Let $\mathcal{C} = \langle I, \Sigma \rangle$ be a contraction-based strategy and $\mathcal{C}'$ with $\Sigma' =
\langle \zeta', \xi', \alpha, \omega \rangle$ a parallelization by subdivision of $\mathcal{C}$. If $\Sigma'$ propagates clauses up to
redundancy, and for all $f \in I_R$, $i$, $n$, $k$ and $\bar{x}$, $\alpha((S_0^k \ldots S_i^k), n, k, f, \bar{x}) = true$,
$\mathcal{C}'$ is also contraction-based.

The requirement that all contractions are allowed to all is strong, especially if
contraction is not only deletion but also deduction (e.g., simplification). Assume
that $\alpha$ forbids contractions, e.g., $f(\bar{x}) = (\{\psi_1, \ldots, \psi_m\}, \{\varphi_1, \ldots, \varphi_p\})$. Consider
a strategy that lets a process delete the $\varphi_j$, but not generate the $\psi_i$, when $\zeta$
selects $f$, $\xi$ selects $\bar{x}$, and $\alpha$ forbids the step. It is sufficient that at least one
process generates and propagates the $\psi_i$ to preserve completeness:

Theorem 16. Let $\mathcal{C}$ and $\mathcal{C}'$ be as in Theorem 15. If $\Sigma'$ propagates clauses up to
redundancy, and whenever $\zeta'((S_0^k \ldots S_i^k), n, k) = f \in I_R$, $\xi'((S_0^k \ldots S_i^k), n, k, f) =
\bar{x}$, $\alpha((S_0^k \ldots S_i^k), n, k, f, \bar{x}) = false$, it is $S_{i+1}^k = S_i^k - \pi_2(f(\bar{x}))$, $\mathcal{C}'$ is also
contraction-based.

4 Search graphs for parallel search 

Given a theorem-proving problem $S$ and an inference system $I$, the search space
induced by $S$ and $I$ is represented by the hypergraph $G(S_I^*) = (V, E, l, h)$, where
$V$ is the set of vertices, $l$ is a vertex-labeling function $l$: $V \to \mathcal{L}_\Theta/\doteq$ (from
vertices to equivalence classes of variants, so that all variants are associated to a
unique vertex), $h$ is an arc-labeling function $h$: $E \to I$, and if $f^m(\varphi_1, \ldots, \varphi_m) =
(\{\psi_1, \ldots, \psi_s\}, \{\gamma_1, \ldots, \gamma_p\})$ for $f^m \in I$, $E$ contains a hyperarc $e = (v_1 \ldots v_n;
w_1 \ldots w_p; u_1 \ldots u_s)$ where $h(e) = f^m$, $n + p = m$, and

- $\forall j$, $1 \leq j \leq n$, $l(v_j) = \varphi_j$ and $\varphi_j \notin \{\gamma_1, \ldots, \gamma_p\}$ (premises not deleted),

- $\forall j$, $1 \leq j \leq p$, $l(w_j) = \gamma_j$ (deleted premises), and $\forall j$, $1 \leq j \leq s$, $l(u_j) = \psi_j$
(generated clauses).

W.l.o.g. we consider hyperarcs in the form $(v_1 \ldots v_n; w; u)$. Contraction inferences
that purely delete clauses are represented as replacement by true (a dummy
clause such that $\varphi \succ true$), and a special vertex $T$ is labelled by true.

The search graph $G(S_I^*) = (V, E, l, h)$ represents the static structure of the
search space. The dynamics of the search during a derivation is described by
marking functions for vertices and arcs:

Definition 17. A parallel marked search-graph $(V, E, l, h, \bar{s}, \bar{c})$ for $p_0, \ldots p_{n-1}$
has an $n$-tuple $\bar{s}$ of vertex-marking functions $s^k$: $V \to \mathbb{Z}$ such that

$$s^k(v) = \begin{cases}
m & \text{if } m \text{ variants } (m > 0) \text{ of } l(v) \text{ are present for process } p_k, \\
-1 & \text{if all variants of } l(v) \text{ have been deleted by process } p_k, \\
0 & \text{otherwise;}
\end{cases}$$

and an $n$-tuple $\bar{c}$ of arc-marking functions $c^k$: $E \to \mathbb{N} \times Bool$ such that $\pi_1(c^k(e))$
is the number of times $p_k$ executed $e$ or received a clause generated by $e$, and
$\pi_2(c^k(e)) = true/false$ if $p_k$ is allowed/forbidden to execute $e$.

Hyperarc $e = (v_1 \ldots v_n; w; u)$ is enabled for $p_k$ if $s^k(v_j) > 0$ for $1 \leq j \leq n$,
$s^k(w) > 0$, and $\pi_2(c^k(e)) = true$.



Definition 18. A parallel derivation induces $n$ successions of vertex-marking
functions $\{s_i^k\}_{i \geq 0}$, one per process. For all $v \in V$, $s_0^k(v) = 0$, and $\forall i \geq 0$:

- If at stage $i$ $p_k$ executes an enabled hyperarc $e = (v_1, \ldots, v_n; w; u)$:

$$s_{i+1}^k(v) = \begin{cases}
s_i^k(v) - 1 & \text{if } v = w \ (-1 \text{ if } s_i^k(v) = 1), \\
s_i^k(v) + 1 & \text{if } v = u \ (+1 \text{ if } s_i^k(v) = -1), \\
s_i^k(v) & \text{otherwise.}
\end{cases}$$

- If at stage $i$ $p_k$ receives $\bar{x} = (\varphi_1, \ldots, \varphi_n)$, where $\varphi_j = l(v_j)$ for $1 \leq j \leq n$:

$$s_{i+1}^k(v) = \begin{cases}
s_i^k(v) + 1 & \text{if } v \in \{v_1, \ldots, v_n\} \ (+1 \text{ if } s_i^k(v) = -1), \\
s_i^k(v) & \text{otherwise.}
\end{cases}$$

- If at stage $i$ $p_k$ sends $\bar{x}$, $s_{i+1}^k(v) = s_i^k(v)$.

Note that $s_0^k(v) = 0$ also for input clauses: the steps of reading or receiving input
clauses are included in the derivation (read steps can be modeled as expansion
steps), because the subdivision function applies to input clauses.
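The update rules of Definition 18 translate almost literally into code. The following sketch is an illustration under assumptions: vertices are named by strings, a marking is a dictionary in which missing vertices count as 0, the deleted premise `w` and the generated vertex `u` are assumed to be distinct, and the function names are hypothetical.

```python
from typing import Dict, Iterable

Marking = Dict[str, int]  # vertex name -> s(v); absent vertices are marked 0


def _inc(s: int) -> int:
    """s + 1, except that a deleted vertex (-1) becomes present once (+1)."""
    return 1 if s == -1 else s + 1


def _dec(s: int) -> int:
    """s - 1, except that deleting the last variant (1) yields -1."""
    return -1 if s == 1 else s - 1


def execute(marking: Marking, w: str, u: str) -> Marking:
    """Marking after executing an enabled hyperarc (v_1 ... v_n; w; u).
    The undeleted premises v_j keep their marking. Assumes w != u."""
    new = dict(marking)
    new[w] = _dec(new.get(w, 0))
    new[u] = _inc(new.get(u, 0))
    return new


def receive(marking: Marking, vertices: Iterable[str]) -> Marking:
    """Marking after receiving clauses whose vertices are given."""
    new = dict(marking)
    for v in vertices:
        new[v] = _inc(new.get(v, 0))
    return new
```

Starting from the empty marking, receiving `a` and then executing an arc that replaces `a` by `b` gives `{"a": -1, "b": 1}`; receiving `a` again resets its marking to 1.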



Definition 19. A parallel derivation induces $n$ successions of arc-marking
functions $\{c_i^k\}_{i \geq 0}$: $\forall a \in E$, $\pi_1(c_0^k(a)) = 0$ and $\pi_2(c_0^k(a)) = true$, and $\forall i \geq 0$:

$$\pi_1(c_{i+1}^k(a)) = \begin{cases}
\pi_1(c_i^k(a)) + 1 & \text{if } p_k \text{ executes } a \text{ or receives a clause generated by } a, \\
\pi_1(c_i^k(a)) & \text{otherwise;}
\end{cases}$$

$$\pi_2(c_{i+1}^k(a)) = \begin{cases}
\alpha((S_0 \ldots S_{i+1}), n, k, f, \bar{x}) & \text{if } \alpha((S_0 \ldots S_{i+1}), n, k, f, \bar{x}) \neq \bot, \\
true & \text{otherwise (arcs allowed by default),}
\end{cases}$$

where $h(a) = f$ and $\bar{x}$ is the tuple of premises of hyperarc $a$.

5 Measures of search complexity

Let $G = (V, E, l, h)$ be a search graph. For all $v \in V$, if $v$ has no incoming
hyperarcs, the ancestor-graph of $v$ is the graph made of $v$ itself; if $e = (v_1 \ldots v_n; v)$
is a hyperarc and $t_1 \ldots t_n$ are ancestor-graphs of $v_1 \ldots v_n$, the graph with root $v$
connected by $e$ to $t_1 \ldots t_n$ is an ancestor-graph of $v$, denoted by $(v; e; (t_1, \ldots, t_n))$.
(To simplify the notation, we include the deleted premise in $v_1 \ldots v_n$.) The set of
the ancestor-graphs of $v$ in $G$ is denoted by $at_G(v)$ (or $at_G(\varphi)$). From now on, $G =
(V, E, l, h, \bar{s}, \bar{c})$ is the parallel marked search graph searched by $p_0, \ldots p_{n-1}$, $G^k =
(V, E, l, h, s^k, c^k)$ is the marked search graph for $p_k$ and $G_i^k = (V, E, l, h, s_i^k, c_i^k)$
is the marked search graph for $p_k$ at stage $i$ of a derivation.

If a relevant ancestor of $\varphi$ in $t \in at_G(\varphi)$ is deleted by contraction, it becomes
impossible to reach $\varphi$ by traversing $t$:

Definition 20. Let $t = (v; e; (t_1 \ldots t_n)) \in at_G(v)$ with $e = (v_1 \ldots v_n; v)$. A
vertex $w \in t$, $w \neq v$, is relevant to $v$ in $t$ for $p_k$ ($w \in Rev_{G^k}(t)$) if either
$w \in \{v_1 \ldots v_n\}$ and $\pi_1(c^k(e)) = 0$, or $\exists i$, $1 \leq i \leq n$, s.t. $w$ is relevant to $v_i$ in $t_i$
for $p_k$.

If $\pi_1(c^k(e)) \neq 0$ because $p_k$ executed $e$, deleting $\psi$ is irrelevant for $\varphi$, since $\psi$ has
been already used to generate $\varphi$. If $\pi_1(c^k(e)) \neq 0$ because $p_k$ received a variant
of $\varphi$ generated by $e$, deleting $\psi$ is irrelevant for $\varphi$, since $\varphi$ came from the outside.

Given $t \in at_G(\varphi)$, the p(ast)-distance measures the portion of $t$ that $p_k$ has
visited; the f(uture)-distance measures the portion of $t$ that $p_k$ needs to visit to
reach $\varphi$ by traversing $t$; the g(lobal)-distance is their sum:

Definition 21. For all clauses $\varphi$, ancestor-graphs $t \in at_G(\varphi)$ and processes $p_k$:

- The p-distance of $\varphi$ on $t$ for $p_k$ is $pdist_{G^k}(t) = |\{w \mid w \in t, s^k(w) \neq 0\}|$.

- The f-distance of $\varphi$ on $t$ for $p_k$ is

$$fdist_{G^k}(t) = \begin{cases}
\infty & \text{if } s^k(\varphi) < 0, \text{ or } \exists w \in Rev_{G^k}(t),\ s^k(w) < 0, \\
|\{w \mid w \in t, s^k(w) = 0\}| & \text{otherwise.}
\end{cases}$$

- The g-distance of $\varphi$ on $t$ for $p_k$ is $gdist_{G^k}(t) = pdist_{G^k}(t) + fdist_{G^k}(t)$.

The f-distance of $\varphi$ in $G$ for $p_k$ is $fdist_{G^k}(\varphi) = \min\{fdist_{G^k}(t) \mid t \in at_G(\varphi)\}$.
The g-distance of $\varphi$ in $G$ for $p_k$ is $gdist_{G^k}(\varphi) = \min\{gdist_{G^k}(t) \mid t \in at_G(\varphi)\}$.
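Definitions 20 and 21 can be read as a recursive computation over one ancestor-graph. The sketch below is a hedged illustration for a single process: ancestor-graphs are trees whose vertices and arcs are named by strings, `s` maps vertices to their marking (0 if absent), `executed` maps arcs to the first component of their arc marking (0 if absent), and the class and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class AncestorGraph:
    root: str
    arc: Optional[str] = None                   # None for a vertex with no incoming arc
    parents: Tuple["AncestorGraph", ...] = ()   # ancestor-graphs of the premises


def vertices(t: AncestorGraph) -> Set[str]:
    out = {t.root}
    for p in t.parents:
        out |= vertices(p)
    return out


def relevant(t: AncestorGraph, executed: Dict[str, int]) -> Set[str]:
    """Vertices relevant to t.root in t: premises of a not yet used arc,
    or vertices relevant to some premise in its own ancestor-graph."""
    out: Set[str] = set()
    if t.arc is not None and executed.get(t.arc, 0) == 0:
        out |= {p.root for p in t.parents}
    for p in t.parents:
        out |= relevant(p, executed)
    return out


def pdist(t: AncestorGraph, s: Dict[str, int]) -> int:
    return sum(1 for w in vertices(t) if s.get(w, 0) != 0)


def fdist(t: AncestorGraph, s: Dict[str, int], executed: Dict[str, int]) -> float:
    if s.get(t.root, 0) < 0 or any(s.get(w, 0) < 0 for w in relevant(t, executed)):
        return math.inf
    return sum(1 for w in vertices(t) if s.get(w, 0) == 0)


def gdist(t: AncestorGraph, s: Dict[str, int], executed: Dict[str, int]) -> float:
    return pdist(t, s) + fdist(t, s, executed)
```

With `t = AncestorGraph("phi", "e", (AncestorGraph("psi"), AncestorGraph("chi")))`, both premises present and `e` unused, the p-distance is 2 and the f-distance is 1; deleting `psi` before `e` is used makes the f-distance infinite.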

While infinite distance captures the effect of contraction, we consider next 
subdivision: 

Definition 22. An ancestor-graph $t$ is forbidden for process $p_k$ if there exists an
arc $e$ in $t$ such that $\pi_1(c^k(e)) = 0$ and $\pi_2(c^k(e)) = false$. It is allowed otherwise.

If $p_k$ receives the clause that $e$ generates, it is $\pi_1(c^k(e)) \neq 0$, and $t$ is no
longer forbidden, because it is no longer true that forbidding $e$ prevents $p_k$ from
exploring $t$. This may happen only as a consequence of communication, because
$e$ cannot be executed. Thus, subdivision forbids ancestor-graphs, reducing the
search effort of each process, but communication may undo it. Indeed, a strategy
employs communication to preserve fairness in the presence of a subdivision. For
fairness, it is sufficient that every non-redundant path is allowed to one process.
If more than one process is allowed, their searches may overlap:

Definition 23. $p_k$ and $p_h$ overlap on ancestor-graph $t$ if $t$ is allowed for both.

An overlap represents a potential duplication of search effort, hence a waste. A 
goal of a parallel search plan is to preserve fairness while minimizing the overlap. 

Definition 24. The bounded search space within distance $j$ for $p_k$ is the multiset
of clauses $space(G^k, j) = \sum_{v \in V, v \neq T} mul_{G^k}(v, j) \cdot l(v)$, where each $l(v)$ has
multiplicity $mul_{G^k}(v, j) = |\{t \mid t \in at_G(v),\ t \text{ allowed for } p_k,\ 0 < gdist_{G^k}(t) \leq j\}|$.

The sequential bounded search spaces $space(G, j)$ are defined in the same
way, with multiplicity $mul_G(v, j) = |\{t \mid t \in at_G(v),\ 0 < gdist_G(t) \leq j\}|$.

Definition 25. The parallel bounded search space within distance $j$ is the
multiset of clauses $pspace(G, j) = \sum_{v \in V, v \neq T} pmul_G(v, j) \cdot l(v)$, where $l(v)$ has
multiplicity $pmul_G(v, j) = \lfloor gmul_G(v, j)/n \rfloor$ with $gmul_G(v, j) = \sum_{k=0}^{n-1} mul_{G^k}(v, j)$.

If $p_h$ and $p_k$ overlap on an ancestor-graph $t$, $t$ is counted in both $mul_{G^h}(v, j)$
and $mul_{G^k}(v, j)$ and twice in $gmul_G(v, j)$, reflecting the overlap.
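A small numeric illustration may help to see how the parallel multiplicity averages the local ones. The sketch assumes, as read from Definition 25, that the global multiplicity is the sum of the per-process multiplicities and that the parallel multiplicity is its integer quotient by the number of processes; the function names are hypothetical and the input is the list of the values of $mul_{G^k}(v, j)$ for $k = 0, \ldots, n-1$.

```python
from typing import Sequence


def gmul(per_process_mul: Sequence[int]) -> int:
    """Global multiplicity: an ancestor-graph allowed to two processes counts twice."""
    return sum(per_process_mul)


def pmul(per_process_mul: Sequence[int]) -> int:
    """Parallel multiplicity: floor of the global multiplicity divided by n."""
    return gmul(per_process_mul) // len(per_process_mul)
```

With three processes, full overlap on one ancestor-graph (`[1, 1, 1]`) gives parallel multiplicity 1, the same as the sequential case, while a subdivision that leaves it to one process only (`[1, 0, 0]`) gives 0.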

Theorem 26. If $\alpha$ is the constant function true (no subdivision), and no
communication occurs, then for all $p_k$, $i \geq 0$ and $j > 0$, $space(G_i^k, j) = space(G_i, j)$,
and $pspace(G_i, j) = space(G_i, j)$.

6 The analysis 

In a sequential derivation, if a clause $\varphi$ is deleted at stage $i$ and regenerated
via another ancestor-graph at some stage $j > i$, a contraction-based strategy
will delete it again, and will do so before using $\varphi$ to generate other clauses [10].
It follows that when $\varphi$ is re-deleted, it is still relevant on all ancestor-graphs
where it was relevant at stage $i$. Therefore, it is possible to make the following
approximation: if $fdist_{G_i}(t)$ is infinite, $fdist_{G_j}(t)$ can be regarded as infinite for
all $j > i$. In a parallel derivation, the situation is more complex. First, we need
to consider not only the possibility that a process $p_k$ regenerates $\varphi$ via another
ancestor-graph, but also the possibility that $p_k$ receives another variant of $\varphi$ from
another process, which may not be aware that $\varphi$ is redundant. Second, we need
to consider not only deleted clauses, but also clauses such that $fdist_{G_i^k}(\varphi) = \infty$
because a relevant ancestor has been deleted on every $t \in at_G(\varphi)$. In a sequential
derivation, it is impossible for $\varphi$ to appear at some $j > i$, because it cannot be
generated, but in a parallel derivation, $\varphi$ may still be received from another
process at some $j > i$. Thus, we prove a more general result:







Lemma 27. In a derivation with local eager contraction, for all $p_k$, $i$, and $\varphi$, if
$fdist_{G_i^k}(\varphi) = \infty$ (regardless of whether $\varphi$ is deleted or made unreachable) and
$s_j^k(\varphi) > 0$ for some $j > i$ (regardless of whether $\varphi$ is received or regenerated),
there exists a $q > j$, such that $s_q^k(\varphi) < 0$ (hence $fdist_{G_q^k}(\varphi) = \infty$), and $p_k$ does
not use $\varphi$ to generate other clauses at any stage $l$, $j \leq l < q$.

Therefore, we can make the approximation that $fdist_{G_i^k}(\varphi) = \infty$ implies
$\forall j > i$ $fdist_{G_j^k}(\varphi) = \infty$. This takes care of clauses that the strategy finds
redundant. Consider a non-redundant clause $\varphi$ and an ancestor-graph $t$ of $\varphi$
such that $fdist_{G_i^k}(t) = \infty$ because $s_i^k(\psi) = -1$ for a relevant ancestor $\psi$ in $t$.
For simplicity, let $\psi$ be a parent of $\varphi$, with arc $e$ from $\psi$ to $\varphi$. Assume that $p_h$ has
not deleted $\psi$, executes $e$, and sends to $p_k$ a $\varphi$ generated by $e$. The arrival of $\varphi$
at some stage $r > i$ makes $\psi$ irrelevant ($\pi_1(c_r^k(e)) = 1$), so that $fdist_{G_r^k}(t) \neq \infty$.
There is irrelevance of contraction at $p_h$, because the clause(s) that contract $\psi$ do
not arrive at $p_h$ fast enough to delete $\psi$ before it is used to generate $\varphi$. When $p_h$
finally deletes $\psi$, this deletion is irrelevant to $t$, because $p_h$ has already executed
$e$: we call this phenomenon late contraction. There is irrelevance of contraction
at $p_k$, because the arrival of $\varphi$ from $p_h$ makes the deletion of $\psi$ irrelevant: we call
this phenomenon contraction undone. Distributed global contraction guarantees
that $p_h$ will delete $\psi$ eventually, so that $s_j^h(\psi) = -1$ for some $j > i$. It is sufficient
that $p_h$ executes $e$ and generates $\varphi$ at a stage $l < j$, and $\varphi$ arrives at $p_k$ at a
stage $r > i$, for this situation to occur. Thus, distributed global contraction is
not sufficient to prevent late contraction and contraction undone.
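The effect of contraction undone on the f-distance can be traced on the smallest possible case, an ancestor-graph with a single arc $e$ from $\psi$ to $\varphi$. The sketch below is only an illustration under that assumption; the function name `fdist_single_arc` and its parameters are hypothetical. It takes the vertex markings of $\psi$ and $\varphi$ for $p_k$ and the number of times $p_k$ executed $e$ or received a clause generated by $e$.

```python
import math


def fdist_single_arc(s_psi: int, s_phi: int, times_e_used: int) -> float:
    """f-distance of phi on the one-arc ancestor-graph psi --e--> phi."""
    # psi is relevant only while e has been neither executed nor "received".
    psi_relevant = times_e_used == 0
    if s_phi < 0 or (psi_relevant and s_psi < 0):
        return math.inf
    # Otherwise count the vertices still to be visited (marking 0).
    return float((s_psi == 0) + (s_phi == 0))
```

Before the message arrives, `fdist_single_arc(-1, 0, 0)` is infinite because the deleted $\psi$ is relevant; once $\varphi$ is received, `fdist_single_arc(-1, 1, 1)` is 0, so the pruning achieved by deleting $\psi$ is lost.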

The following theorems integrate all our observations on subdivision,
contraction and communication:

1. If $S_i^k \vdash S_{i+1}^k$ generates $\psi$, then $\forall j > 0$, $space(G_{i+1}^k, j) \preceq_{mul} space(G_i^k, j)$.
When $\psi$ is generated, the subdivision function $\alpha$ may become defined on a
tuple of premises $\bar{x}$ including $\psi$. If $\alpha$ decides that an arc $e$ with premises
$\bar{x}$ is forbidden, ancestor-graphs including $e$ become forbidden, so that the
bounded search spaces become smaller.

2. If $S_i^k \vdash S_{i+1}^k$ replaces $\psi$ by $\psi'$, then $\forall j > 0$, $space(G_{i+1}^k, j) \preceq_{mul} space(G_i^k, j)$.
A contraction step replacing $\psi$ by $\psi'$ prunes those ancestor-graphs whose
distance becomes infinite because of the deletion of $\psi$, and those
ancestor-graphs which become forbidden as a consequence of the generation of $\psi'$.

3. If $S_i^k \vdash S_{i+1}^k$ sends $\psi$, then $\forall j > 0$, $space(G_{i+1}^k, j) = space(G_i^k, j)$. If
$S_i^k \vdash S_{i+1}^k$ receives $\psi$, $\forall j > 0$, $\exists l \leq i$, $space(G_{i+1}^k, j) \preceq_{mul} space(G_l^k, j)$.
When $p_k$ receives $\psi$, there may be three kinds of consequences: allowed
ancestor-graphs may become forbidden (subdivision), reducing the
multiplicity of some clauses; forbidden ancestor-graphs may become allowed
(subdivision undone) and relevant deleted ancestors may become irrelevant
(contraction undone), increasing the multiplicity of some clauses. However, since
communication cannot expand the bounded search spaces, but only undo
previous reductions, the resulting bounded search spaces are limited by the
bounded search spaces at some previous stage.

These theorems show that the bounded search spaces capture all relevant
phenomena: pruning by contraction, subdivision and cost of communication. While
in a sequential derivation the bounded search spaces may either remain the same
(expansion) or decrease (contraction), in a parallel derivation the bounded search
spaces of a process may oscillate non-monotonically because of communication.
The faster communication is, however, the lower the incidence of late
contraction and contraction undone; in the limit, if the strategy has instantaneous
propagation of clauses up to redundancy, they disappear:

Lemma 28. In a derivation with local eager contraction and instantaneous
propagation of clauses up to redundancy, let $e$ be an arc of $t \in at_G(\varphi)$ which uses $\psi$
and generates $\psi' \in S_\infty - R(S_\infty)$. If $s_i^k(\psi) = -1$ and $\psi \in Rev_{G_i^k}(t)$ for some $p_k$:

1. $\forall p_h$, $\forall j$, $s_j^h(\psi) = -1$ implies $\psi \in Rev_{G_j^h}(t)$ (what is relevant to one process
is relevant to all: no late contraction).

2. $\forall j > i$, $\psi \in Rev_{G_j^k}(t)$ (what is relevant at a stage remains relevant at all
following stages: no contraction undone).

The approximation $fdist_{G_i^k}(t) = \infty \Rightarrow \forall j > i$ $fdist_{G_j^k}(t) = \infty$ can be made:

Theorem 29. In a derivation with local eager contraction and instantaneous
propagation of clauses up to redundancy, if $fdist_{G_i^k}(t) = \infty$ and $fdist_{G_j^k}(t) \neq \infty$
for some $0 < i < j$, there exists a $q > j$ such that $fdist_{G_q^k}(t) = \infty$.

Next, we turn our attention to the overlap. We observe that two overlapping
processes may generate variants of the same clause. The following property
prevents different processes from generating variants of the same clause:

Definition 30. A subdivision function $\alpha$ has no clause-duplication if for all
vertices $u \neq T$, for any two hyperarcs into $u$, $e_1$ with inference rule $f$ and premises $\bar{x}$,
and $e_2$ with inference rule $g$ and premises $\bar{y}$, $\forall i \geq 0$, if $\alpha((S_0, \ldots S_i), n, k, f, \bar{x}) =
true$ and $\alpha((S_0, \ldots S_i), n, h, g, \bar{y}) = true$, then $k = h$.
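For a fixed stage, the condition of Definition 30 amounts to saying that all arcs into the same vertex are allowed to at most one process. A minimal sketch of such a check follows; the representation is an assumption made for illustration: the decisions of the subdivision function are given as (process, arc, target vertex) triples for which it returned true, the special vertex is named "T", and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Set, Tuple

# (process id, arc id, target vertex) for every arc the process is allowed to execute
Allowed = Iterable[Tuple[int, Hashable, Hashable]]


def has_no_clause_duplication(allowed: Allowed, top: Hashable = "T") -> bool:
    """True if, for each vertex other than `top`, a single process holds
    every permission to execute arcs into that vertex."""
    generators: Dict[Hashable, Set[int]] = defaultdict(set)
    for process, _arc, target in allowed:
        if target != top:
            generators[target].add(process)
    return all(len(processes) <= 1 for processes in generators.values())
```

A table in which process 0 is allowed two arcs into `u` and process 1 one arc into `v` passes the check; allowing arcs into `u` to both processes does not.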

This property is compatible with fairness, for which one allowed process is
sufficient. We show next that the combination of no clause-duplication and local
eager contraction minimizes the overlap. There are two kinds of overlap: one
caused by the subdivision function itself when it allows the same arc to more
than one process, and one caused by communication (e.g., $\pi_2(c^k(e)) = true$ and
$\pi_2(c^h(e)) = false$ but $\pi_1(c^h(e)) \neq 0$). No clause-duplication avoids the first
kind of overlap by definition. For the second one, assume that $p_k$ is the only
process authorized to generate all variants of $\varphi$. By local eager contraction, if $p_k$
generates more than one variant of $\varphi$, all but one are deleted before being sent
to any other process. Thus, $p_k$ may send to another process only one variant,
and the same variant to all processes, so that:

Lemma 31. In a derivation with local eager contraction and no clause-duplication,
for any clause $\varphi$, if $p_h$ is the only process allowed to generate $\varphi$, $\exists r$ such
that $\forall k \neq h$, $\forall i \geq r$, $\forall j > 0$, $mul_{G_i^k}(\varphi, j) \leq 1$ (i.e., communication may make at
most one forbidden ancestor-graph allowed).

We have all the elements to compare a contraction-based, uniformly fair
strategy $\mathcal{C} = \langle I, \Sigma \rangle$ with a parallelization by subdivision $\mathcal{C}' = \langle I, M, \Sigma' \rangle$, which
is uniformly fair and contraction-based. Since $\mathcal{C}$ and $\mathcal{C}'$ have the same inference
system, the initial search space is the same (i.e., $G_0^k = G_0$ and $space(G_0^k, j) =
space(G_0, j)$ for all $k$ and $j$). We compare first the behavior on redundant clauses,
next on ancestor-graphs including redundant inferences, and then on the
remaining ancestor-graphs. We begin by proving three preliminary lemmas:

1. If $\varphi \in S_i$ for some $i$, then $\exists p_k\ \exists j$ such that either $\varphi \in S_j^k$ or $\varphi \in R(S_j^k)$.

2. If $\varphi \in R(S_i)$ for some $i$, then $\forall p_k\ \exists j$ such that $\varphi \in R(S_j^k)$.

3. If $fdist_{G_i}(\varphi) = \infty$ for some $i$, then $\exists j$ s.t. $\forall l \geq j$, either $fdist_{G_l^k}(\varphi) =
\infty$, or all $t \in at_G(\varphi)$ are forbidden for $p_k$ at stage $l$.

These allow us to show that all redundant clauses eliminated by $\mathcal{C}$ will be
excluded by $\mathcal{C}'$ as well:

Theorem 32. If $fdist_{G_i}(\varphi) = \infty$ for some $i$, then there exists an $r$ such that
for all $i \geq r$ and $j > 0$, $pmul_{G_i}(\varphi, j) = 0$.

To show that all ancestor-graphs pruned by $\mathcal{C}$ are pruned by $\mathcal{C}'$, we need to
use Lemma 28 to prevent late contraction and contraction undone, and this can
be done only under the hypothesis of instantaneous propagation of clauses:

Lemma 33. Assume that $\mathcal{C}'$ has instantaneous propagation of clauses up to
redundancy. If $fdist_{G_i}(t) = \infty$ for some $i$, then for all $p_k$ there exists a $j$ such
that for all $l \geq j$, either $fdist_{G_l^k}(t) = \infty$, or $t$ is forbidden for $p_k$ at stage $l$.

The final lemma covers ancestor-graphs that are not pruned. Thus, we need
a hypothesis on subdivision, and we assume that $\mathcal{C}'$ has no clause-duplication:

Lemma 34. Assume that $\mathcal{C}'$ has instantaneous propagation of clauses up to
redundancy and no clause-duplication. If $fdist_{G_i}(\varphi) \neq \infty$ for all $i$, there exists
an $r$ such that for all $i \geq r$ and $j > 0$, $pmul_{G_i}(\varphi, j) \leq mul_{G_i}(\varphi, j)$.

Theorem 35. If $\mathcal{C}'$ has instantaneous propagation of clauses up to redundancy
and no clause-duplication, $\forall j\ \exists m$ s.t. $\forall i \geq m$ $pspace(G_i, j) \preceq_{mul} space(G_i, j)$.

Intuitively, a value j of the bound may represent the search depth required to 
find a proof. If the problem is hard enough that the sequential strategy does 
not succeed before stage m, the parallel strategy faces a smaller bounded search 
space beyond m, and therefore may succeed sooner. 

Theorem 35 is a limit theorem, in a sense similar to other theoretical results
obtained under an ideal assumption. On one hand, it explains the nature of
the problem, by identifying the overlap and the communication-contraction
node as its essential aspects. On the other hand, it represents a limit that concrete
strategies may approximate by improving overlap control and communication.

7 Discussion

If it had been possible to prove that a contraction-based parallelization has
smaller bounded search spaces without assuming instantaneous communication,
there would have been grounds to expect a generalized success of distributed
search, at least to the extent to which smaller bounded search spaces
mean shorter search. However, we found that this type of result does not hold.
Therefore, a distributed-search contraction-based strategy may do better than its
sequential counterpart, but it is not guaranteed to. When adopting distributed
search, one expects that communication will have a cost, and contraction may
be delayed. The trade-off is to accept these disadvantages in order to avoid
synchronization (a method where parallel processes have to synchronize on every
inference in order to enforce eager contraction would be hopeless). Also, one may
conjecture that the advantage of subdivision will offset the cost of communication
in terms of delayed contraction. Our analysis showed that this conjecture does
not hold on the bounded search spaces. In summary, this analysis helps
explain why the parallelization of efficient forward-reasoning strategies has been
an elusive target. Furthermore, the explanation is analytical, rather than based
solely on empirical observations.

So little is known about complexity in theorem proving and strategy analysis,
however, that these findings should be regarded as a beginning, not a conclusion.
In this paper we have tried essentially to determine whether distributed search
may make the search space smaller by doing at least as much contraction as
the sequential process and adding the effect of the subdivision. Accordingly, we
have compared bounded search spaces by comparing the multiplicities of each
clause. The question remains whether distributed search may gain an advantage
by performing steps in a different order, especially contraction steps, hence
producing different search spaces. Thus, a first direction for further research
may be to find other ways to compare the bounded search spaces, which may
shed light on other aspects, and possibly other advantages, of distributed search.
Another direction for future work is to apply the bounded search spaces to
analyze multi-search contraction-based strategies. These issues may be connected
to, or even require, the continuation of the analysis of sequential
contraction-based strategies. In [10], we compared strategies with the same search plan and
inference systems that differ in contraction power. The complementary problem
of analyzing sequential strategies with the same inference system but different
search plans still needs to be addressed. Finally, we have considered only
forward-reasoning strategies; another line of research is to extend our methodology to
subgoal-reduction strategies, such as those based on model elimination.



Abstract. Database and Artificial Intelligence applications are briefly
discussed and it is argued that they need deduction methods that are
not only refutation complete but also complete for finite satisfiability.
A novel deduction method is introduced for such applications. Instead
of relying on Skolemization, as most refutation methods do, the
proposed method processes existential quantifiers in a special manner which
makes it complete not only for refutation, but also for finite satisfiability.
A main contribution of this paper is the proof of these results.
Keywords: Artificial Intelligence, Expert Systems, Databases,
Automated Reasoning, Finite Satisfiability.



1 Introduction 

For many applications of automated reasoning, the tableaux methods [32,16,34,18]
have the following advantages: they not only detect unsatisfiability but
also generate models; they are close to common sense reasoning, hence easy to
enhance with an explanation tool; and they are quite easy to adapt to the special
syntax used in some applications. However, for most applications these methods
suffer from the following drawbacks: they are often significantly less efficient
than resolution-based methods, and they sometimes initiate the construction of
infinite models, even if finite ones exist.

In this paper, a novel approach is formally introduced, which aims at
overcoming these drawbacks. Like the approach [25,11] it refines and extends, this
method relies on resolution and “range restriction” for avoiding the “blind
instantiation” performed by the $\gamma$ rule [32,16]. Thanks to range restriction, the
proposed method can represent interpretations as sets of ground positive
literals. This is beneficial for two reasons. First, it often considerably reduces the
search space. Second, it is well suited to application areas such as Artificial
Intelligence, Databases, and Logic Programming, where this representation of
interpretations and models is usual. Instead of relying on Skolemization, as most
refutation methods do, the proposed method uses the extended $\delta$ rule, also called
$\delta^*$ rule, proposed in [10,20,23]. This rule makes the method complete not only
for refutation, but also for finite satisfiability. A prototype written in Prolog
implements the proposed method, and a refinement of it is used in a database
application [8,9]. There, the method described in the present paper is informally
recalled, but neither formally stated, nor proved to be sound and complete. A
main contribution of the present paper is the proof of these results.



2 Applications 



In several application areas, specific techniques have been developed that can
be expressed as a systematic search for models of first-order logic specifications.

Diagnosis. The approach to diagnosis described in [28] relies on rules of the
form $P_1 \wedge \ldots \wedge P_n \to C_1 \vee \ldots \vee C_m$ interpreted as follows: the premises $P_1, \ldots, P_n$
are causes for symptoms $C_1$, ..., or $C_m$. Generating a diagnosis thus consists in
building up models of the set of rules and in selecting those models that
satisfy the observed symptoms. Tableaux methods are very convenient for this
purpose, as described, e.g., in [4]. Diagnosis in fact requires seeking models
that are as small as possible [5], for simpler explanations are to be preferred to
redundant ones: this principle is known as “Occam’s razor”.

Database View Updates. A database view can be defined as the universal
closure of a rule of the form $P_1 \land \dots \land P_n \to C$. Such a view
makes it possible to compute instances of $C$ from instances of the $P_i$, and
thus to avoid blowing up the database with “C data”. If the view, i.e. the set
of derived “C data”, is to be updated, changes to the $P_i$ corresponding to
the desired view update have to be determined. This is conveniently expressed
as a model generation problem [7,2].
Meaningful solutions to a view update problem obviously have to be finite. Thus, 
view updates can only be computed by model generators that are complete for 
finite satisfiability. 

Database Schema Design. In general, a database is “populated” from an 
initial database consisting of empty relations and views, and integrity constraints 
[19]. It is, however, possible that ill-defined integrity constraints prevent the 
insertion of any data. A model generator can be applied to detect such cases: 
populating the database will be possible if and only if its schema has a nonempty 
and finite model. The system described in [8,9] makes it possible to verify this.

Planning and Design. Solving planning and design problems can as well be
seen as model generation. The specifications might describe an environment, the 
possible movements of a robot, a starting position, and a goal to reach. They 
can also describe how a complex object can be built from atomic components. 
In both cases, each finite model describes a solution while infinite models are 
meaningless. 

Natural Language Understanding. Interpreting natural language sentences 
is often performed by generating the possible models of a logic representation of 
the considered sentences. Consider e.g. the sentences “Anna sees a dog. So does
Barbara.” that can be expressed by $\exists x\,(sees(Anna, x) \land dog(x))$
and $\exists y\,(sees(Barbara, y) \land dog(y))$. Skolemization as well as the
$\delta$ rule of standard tableaux methods would only generate one model with
two distinct dogs. In contrast, thanks to the extended $\delta$ (or
$\delta^*$) rule, the method described in the present paper would more
conveniently generate both a model with a single dog and a model with two
distinct dogs [14].

In the above mentioned applications, the models sought must be finite.

Program Verification. In program verification, one tries to prove properties
from programs, e.g. loop invariants. Often enough, program drafts do not fulfill 
their specifications. Model generators can be applied to (a logic representation 
of) the programs to generate “samples”, or “cases” in which a requirement is 
violated. These samples can then be used for correcting the programs under 
development. Clearly Occam’s razor applies: the simplest samples are preferable
to larger ones, which programmers would regard as “redundant”.

Theorem Proving. Refutation theorem proving can benefit from model generation
in a similar manner. If a conjecture $C$ is not a consequence of a set $S$ of
formulas, then applying a model generator to $S \cup \{\neg C\}$ will construct
counterexamples to the conjectured theorem, i.e. models of $S \cup \{\neg C\}$.
These models can be used for correcting the conjecture $C$. Here again Occam’s
razor applies: if counterexamples can be found, the “smallest” ones will better
help in understanding the flaw in the conjecture than redundant counterexamples.

Counterexamples to program specifications and conjectures do not have to 
be finite. For these applications, counterexamples that can be found in finite 
time, i.e. that are finitely representable, are sufficient. However, in case finite 
counterexamples exist, it is desirable to detect them. For this purpose, a model 
generator complete for “finite satisfiability” is needed. This is possible, since 
finite satisfiability is semidecidable [36]. To detect unsatisfiability as
well, one can rely on a refutation prover coupled with a finite-model finder
such as FINDER [31] and SEM [37], as described e.g. in [30]. The present paper
proposes instead a single deduction method complete for both unsatisfiability
and finite satisfiability. Arguably, this method is more convenient for many
applications. In particular, a single method is better amenable to user
interaction and explanation as provided by the system [8,9]. For some
applications, e.g. natural language understanding, a single method is
necessary [14] and the coupling approach of [30] is not applicable.

Note that the applications mentioned here in general do not give hints for
the size of the finite models sought. Note also that most applications require
that the model generator constructs only “minimal models” in the sense of e.g. 
[11]. This related issue is beyond the scope of the present paper. 

3 Preliminaries 

Throughout this paper, a language with a denumerable number of constants,
but without function symbols other than constants, is assumed. The
interpretations (and models) considered are term interpretations (term models,
resp.) that, except for their domains, are defined like Herbrand
interpretations (models, resp.) [16]. The domain of a term interpretation
$\mathcal{T}$ consists of all ground terms, here constants, occurring in the
ground atoms satisfied by $\mathcal{T}$, augmented with an additional,
arbitrary constant $c_0$ occurring in no ground atoms satisfied by
$\mathcal{T}$. In contrast, the domain of a Herbrand interpretation consists
of all possible ground terms constructible from the constants and function
symbols of the considered language. The mapping of an interpretation that
assigns to every constant (resp. $n$-ary relation symbol) an element of the
domain (resp. an $n$-ary relation over the domain) will be called assignment.

A term interpretation is uniquely characterized by the set $\mathcal{G}$ of
ground atoms it satisfies, and will therefore be denoted by
$\mathcal{T}(\mathcal{G})$. If $S$ is a set (finite set, resp.) of formulas and
$\mathcal{T}(\mathcal{G})$ a term model of $S$, there might be constants
(finitely many constants, resp.) in $S$ which do not occur in $\mathcal{G}$.
These constants are assumed to be interpreted over the special constant $c_0$
which without loss of generality can further be assumed not to occur in $S$.
The subset relation $\subseteq$ induces an order $\le$ on term
interpretations: $\mathcal{T}(\mathcal{G}_1) \le \mathcal{T}(\mathcal{G}_2)$
iff $\mathcal{G}_1 \subseteq \mathcal{G}_2$. A term model of a set of formulas
is said to be minimal if it is minimal for $\le$.

The first-order language considered is assumed to include two atoms $\bot$ and
$\top$ that respectively evaluate to false and true in all interpretations. A
negated formula $\neg F$ will always be treated as the implication
$F \to \bot$. The multiple quantification $\forall x_1 x_2 \dots x_n F$, also
written $\forall \bar{x} F$ if $\bar{x}$ is the tuple of variables
$x_1 x_2 \dots x_n$, is a shorthand notation for
$\forall x_1 \forall x_2 \dots \forall x_n F$. The notation $\forall e F$,
where $e$ denotes the empty tuple, is allowed and stands for the formula $F$.
Except when otherwise stated, “formula” is used in lieu of “closed formula”. If
$\bar{x}$ is a tuple of variables $x_1 \dots x_n$ and if $\bar{c}$ is a tuple
of constants $c_1 \dots c_n$, then $[\bar{c}/\bar{x}]$ will denote the
substitution $\{c_1/x_1, \dots, c_n/x_n\}$.

In the following, familiarity with tableaux methods as introduced in e.g.
[32,16,18] is assumed.

4 Positive Formulas with Restricted Quantifications 

In this section, a fragment of first-order logic, that of “positive formulas with 
restricted quantifications” (short PRQ formulas), is introduced. Arguably, this 
fragment is convenient for applications. It is shown to have the same expres- 
sive power as full first-order logic. The intuition of PRQ formulas is that of 
so-called “restricted quantification” in natural language. The first time an ob- 
ject is referred to in a formula, i.e., when a variable is quantified, a positive 
expression called “range” specifies which kind of object is meant, as e.g. in
$\forall x\,(employee(x) \to \exists y\,(boss(y) \land works\text{-}for(x, y)))$.
The expressions $employee(x)$ and $boss(y)$ are ranges for $x$ and $y$,
respectively. Note also the use of an implication (resp.
conjunction) for introducing the range of a universally (resp. existentially) quan- 
tified variable. Ranges and PRQ formulas are defined relying on auxiliary notions 
that are first introduced. 

Definition 1. 

• Positive conditions are inductively defined as follows:

1. Atoms except $\bot$ are positive conditions.

2. Conjunctions and disjunctions of positive conditions are positive conditions.

3. $\exists x\, F$ is a positive condition if $F$ is a positive condition.

• Ranges for variables $x_1, \dots, x_n$ are inductively defined as follows:

1. An atom in which all of $x_1, \dots, x_n$ occur is a range for
$x_1, \dots, x_n$.

2. $A_1 \lor A_2$ is a range for $x_1, \dots, x_n$ if both $A_1$ and $A_2$ are
ranges for $x_1, \dots, x_n$.

3. $A_1 \land A_2$ is a range for $x_1, \dots, x_n$ if $A_1$ is a range for
$x_1, \dots, x_n$ and $A_2$ is a positive condition.

4. $\exists y\, R$ is a range for $x_1, \dots, x_n$ if $R$ is a range for
$y, x_1, \dots, x_n$ and if $x_i \neq y$ for all $i = 1, \dots, n$.

• Positive Formulas with Restricted Quantifications (short PRQ formulas) are
inductively defined as follows:

1. Atoms (in particular $\bot$ and $\top$) are PRQ formulas.

2. Conjunctions and disjunctions of PRQ formulas are PRQ formulas.

3. A formula of the form $P \to F$ is a PRQ formula if $P$ is a positive
condition and $F$ a PRQ formula.

4. A formula of the form $\forall x_1 \dots x_n\,(R \to F)$ ($n \ge 1$) is a
PRQ formula if $R$ is a range for $x_1, \dots, x_n$ and if $F$ is a PRQ formula.

5. A formula of the form $\exists x\,(R \land F)$ is a PRQ formula if $R$ is a
range for $x$ and if $F$ is a PRQ formula.

Example 2. The formula
$F = \forall x y\,(p(x) \land q(y) \lor r(x, y) \to \exists z\,(s(z) \land t(z)))$
is a PRQ formula, because $p(x) \land q(y) \lor r(x, y)$ is a range for $x$ and
$y$ and $s(z)$ is a range for $z$. The formula
$G = \forall x y\,(p(x) \lor q(y) \to s(y))$ in contrast is not a PRQ formula,
since the premise of the implication is not a range for $x$ and $y$.

Note, that ranges are positive conditions. The following Lemma will be used 
in proving Theorem 21. 

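Lemma 3. Let $\mathcal{M}$ and $\mathcal{N}$ be sets of ground atoms such that
$\mathcal{M} \subseteq \mathcal{N}$, and $R$ a positive condition. If
$\mathcal{T}(\mathcal{M}) \models R$, then $\mathcal{T}(\mathcal{N}) \models R$.

Proof. (sketched) By induction on the structure of $R$. ■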
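
The monotonicity stated in Lemma 3 can be made concrete with a small evaluator for positive conditions. The following is only an illustrative sketch, not part of the method itself: the tuple encoding of conditions, the names `holds`, `subst` and `constants`, and the convention that the extra domain constant is the string `"c0"` are all assumptions made here.

```python
# Positive conditions are encoded as nested tuples (an assumed encoding):
#   ("atom", ("p", "a", "X"))   ("and", A, B)   ("or", A, B)   ("exists", "X", A)
# Ground atoms are tuples ("pred", arg1, ..., argn).

def constants(atoms):
    """Constants occurring in a set of ground atoms."""
    return {t for a in atoms for t in a[1:]}

def subst(cond, var, c):
    """Replace the free occurrences of variable var by constant c."""
    kind = cond[0]
    if kind == "atom":
        pred, *args = cond[1]
        return ("atom", (pred, *[c if t == var else t for t in args]))
    if kind in ("and", "or"):
        return (kind, subst(cond[1], var, c), subst(cond[2], var, c))
    if cond[1] == var:          # variable rebound by an inner quantifier
        return cond
    return ("exists", cond[1], subst(cond[2], var, c))

def holds(cond, atoms):
    """Truth of a closed positive condition in the term interpretation
    characterized by the set of ground atoms `atoms`."""
    kind = cond[0]
    if kind == "atom":
        return cond[1] in atoms
    if kind == "and":
        return holds(cond[1], atoms) and holds(cond[2], atoms)
    if kind == "or":
        return holds(cond[1], atoms) or holds(cond[2], atoms)
    if kind == "exists":
        domain = constants(atoms) | {"c0"}   # "c0": assumed extra constant
        return any(holds(subst(cond[2], cond[1], c), atoms) for c in domain)
    raise ValueError(f"not a positive condition: {cond!r}")

M = {("p", "b")}
N = {("p", "b"), ("q", "a", "b")}            # M is a subset of N
R = ("exists", "Y", ("and", ("atom", ("p", "Y")), ("atom", ("q", "a", "Y"))))
print(holds(R, M), holds(R, N))
```

Since positive conditions contain neither negation nor implication, adding atoms can only turn a false condition into a true one, which is the content of the lemma.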

For most applications the restriction to PRQ formulas is not a severe
restriction, since in general quantifications in natural languages are
“restricted” through (implicit or explicit) sorts. Furthermore, for every
finite set $\mathcal{F}$ of first-order formulas there exists a finite set
$PRQ(\mathcal{F})$ of PRQ formulas with the “same” models as $\mathcal{F}$ in
the following sense:

Theorem 4. (Expressive Power of PRQ Formulas) Let $\Sigma$ be the signature
of the first-order language under consideration, $D$ a unary predicate such that
$D \notin \Sigma$, and $\Sigma' = \Sigma \cup \{D\}$. Then for every finite set
$\mathcal{F}$ of first-order formulas over $\Sigma$ there exists a finite set
$PRQ(\mathcal{F})$ of PRQ formulas over $\Sigma'$ such that:

1. If $(\mathcal{D}, m)$ is a model of $\mathcal{F}$ with domain $\mathcal{D}$
and assignment function $m$, and if $m'$ is the mapping over $\Sigma'$ defined
as follows:

$$m'(s) = \begin{cases} m(s) & \text{if } s \neq D \\ \mathcal{D} & \text{if } s = D \end{cases}$$

then $(\mathcal{D}, m')$ is a model of $PRQ(\mathcal{F})$.

2. If $(\mathcal{D}', m')$ is a model of $PRQ(\mathcal{F})$, then there exists
$\mathcal{D} \subseteq \mathcal{D}'$ such that $(\mathcal{D}, m'|_{\Sigma})$ is
a model of $\mathcal{F}$, where $m'|_{\Sigma}$ is the restriction of $m'$ to
$\Sigma$.

Proof. (sketched) Let $\mathcal{F}$ be a finite set of formulas. Recall that
there exists a finite set $\mathcal{G}$ of formulas in prenex conjunctive normal
form such that $\mathcal{F}$ and $\mathcal{G}$ are logically equivalent. Recall
also that a disjunction $D = D_1 \lor \dots \lor D_n$ of literals is equivalent
to the implication $P \to C$ with (1) $P = P_1 \land \dots \land P_k$ if the set
$\{\neg P_i \mid i = 1, \dots, k\}$ of negative literals in $D$ is nonempty,
$P = \top$ otherwise, and (2) $C = C_1 \lor \dots \lor C_m$ if the set
$\{C_i \mid i = 1, \dots, m\}$ of positive literals in $D$ is nonempty,
$C = \bot$ otherwise. Call “in implication form” the formula obtained from a
formula in prenex conjunctive normal form by transforming each of its conjuncts
into the above-mentioned, logically equivalent implication form. Hence, there
exists a finite set $\mathcal{F}''$ of formulas in implication form which is
logically equivalent to $\mathcal{F}$. Let $\mathcal{F}'$ be the finite set of
PRQ formulas obtained by applying the following transformation $\mathcal{R}$ to
the formulas in $\mathcal{F}''$:
$\mathcal{R}(\forall x F) := \forall x\,(D(x) \to \mathcal{R}(F))$,
$\mathcal{R}(\exists x F) := \exists x\,(D(x) \land \mathcal{R}(F))$, and
$\mathcal{R}(F) := F$ if $F$ is not a quantified formula. One easily verifies
that $PRQ(\mathcal{F}) := \mathcal{F}' \cup \mathcal{C}(\mathcal{F})$ fulfills
the condition of Theorem 4, where
$\mathcal{C}(\mathcal{F}) := \{D(c) \mid c \text{ constant occurring in } \mathcal{F}\}$
if some constants occur in $\mathcal{F}$, and
$\mathcal{C}(\mathcal{F}) := \{D(c_0)\}$ for some arbitrary constant $c_0$
otherwise. ■
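
The transformation used in this proof sketch is simple enough to be written down directly. The code below is a minimal sketch under assumptions of our own: formulas are encoded as nested tuples, the input is already in prenex implication form, and the names `relativize` and `domain_facts` are hypothetical and do not come from any existing implementation.

```python
# Assumed encoding: ("forall", x, F), ("exists", x, F), ("implies", P, C),
# ("atom", ("pred", args...)). Quantifiers only occur as a prefix.

def relativize(formula):
    """Guard every quantifier of the prefix with the domain predicate D."""
    kind = formula[0]
    if kind == "forall":
        x, body = formula[1], formula[2]
        return ("forall", x, ("implies", ("atom", ("D", x)), relativize(body)))
    if kind == "exists":
        x, body = formula[1], formula[2]
        return ("exists", x, ("and", ("atom", ("D", x)), relativize(body)))
    return formula               # not a quantified formula: left unchanged

def domain_facts(constants):
    """D(c) for every constant of the input, or D(c0) if there is none."""
    return {("D", c) for c in constants} or {("D", "c0")}

matrix = ("implies", ("atom", ("p", "x")), ("atom", ("q", "x", "y")))
f = ("forall", "x", ("exists", "y", matrix))
print(relativize(f))
print(domain_facts(["a", "b"]), domain_facts([]))
```

Each quantified variable thus receives the atom built from `D` as its range, which is what makes the result a PRQ formula.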

The predicate D of Theorem 4 generalizes the domain predicate dom of 
[25] . Note that other, more sophisticated transformations of first-order formulas 
into PRQ formulas than that used in the previous proof are possible which, for 
efficiency reasons, are more convenient in practice. For space reasons, they are 
not discussed here. 

Corollary 5. For every finite set $\mathcal{F}$ of first-order formulas there
exists a finite set $PRQ(\mathcal{F})$ of PRQ formulas such that $\mathcal{F}$
is finitely satisfiable if and only if $PRQ(\mathcal{F})$ has a finite term
model.

Proof. From Theorem 4 it follows that for every finite set $\mathcal{F}$ of
first-order formulas there exists a finite set $PRQ(\mathcal{F})$ of PRQ
formulas such that $\mathcal{F}$ is finitely satisfiable if and only if
$PRQ(\mathcal{F})$ is finitely satisfiable. If $PRQ(\mathcal{F})$ has a finite
model $\mathcal{M}$, then a finite term model of $PRQ(\mathcal{F})$ is obtained
by a renaming of the elements of the universe of $\mathcal{M}$. ■



5 Extended Positive Tableaux 

Extended Positive tableaux, EP tableaux for short, are a refinement of the PUHR
tableaux defined in [11] as a formalization of the SATCHMO theorem prover [25].
Other related formalizations are given in [13,3]. The refinement of EP tableaux
consists of the processing of PRQ formulas instead of (Skolemized) clauses, and
of a tableaux expansion rule for existentially quantified subformulas which, as
opposed to the standard $\delta$ rule [32,16], performs no “run time
Skolemization”.

Example 6. Consider $S = \{p(a), \forall x\,(p(x) \to \exists y\, p(y))\}$^1
and a Skolemized version $Sk(S)$ of $S$. Applied to $Sk(S)$, the PUHR tableaux
method initiates the construction of the infinite model
$\{p(a), p(f(a)), p(f(f(a))), \dots\}$. A similar problem arises if the
standard $\delta$ rule is applied to $S$: the finite model $\{p(a)\}$ of $S$ is
not detected by the PUHR tableaux method.



Definition 7. (EP Tableaux expansion rules)

$\exists$ (or $\delta^*$) rule:

$$\frac{\exists x\, E(x)}{E[c_1/x] \;\mid\; \dots \;\mid\; E[c_k/x] \;\mid\; E[c_{new}/x]}$$

where $\{c_1, \dots, c_k\}$ is the set of all constants occurring in the
expanded node, and where $c_{new}$ is a constant distinct from all $c_i$ for
$i = 1, \dots, k$.

PUHR rule:

$$\frac{\forall \bar{x}\,(R(\bar{x}) \to F)}{F[\bar{c}/\bar{x}]}$$

where $R[\bar{c}/\bar{x}]$ is satisfied by the interpretation specified by the
expanded node, i.e., by the set of ground atoms occurring in that node.

$\lor$ rule:

$$\frac{E_1 \lor E_2}{E_1 \;\mid\; E_2}$$

$\land$ rule:

$$\frac{E_1 \land E_2}{\begin{array}{c} E_1 \\ E_2 \end{array}}$$

In the PUHR rule, c is a tuple of constants occurring in the expanded node. 
These constants are determined by evaluating R against the already constructed 
interpretation, i.e., the term interpretation determined by the set of ground 
atoms occurring in the node. This evaluation corresponds to an extension of
positive unit hyperresolution. It coincides with (standard) positive unit
hyperresolution if $R \to F$ has the form
$P_1 \land \dots \land P_n \to C_1 \lor \dots \lor C_m$ where the $P_i$
($i = 1, \dots, n$) and $C_j$ ($j = 1, \dots, m$) are atoms. Recall that the
notation $\forall e\,(R \to F)$, where $e$ denotes the empty tuple, is allowed
and stands for the formula $R \to F$. Thus, the PUHR rule handles both
universally quantified and implicative formulas.
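
The two rules that distinguish EP tableaux can be sketched on a node given as a set of ground atoms. This is an illustration under simplifying assumptions, not the implementation of the authors: ranges are restricted to conjunctions of atoms, variables are upper-case strings, fresh constants are named `c1`, `c2`, ..., and the function names are hypothetical.

```python
from itertools import product

# Ground atoms are tuples ("pred", arg1, ..., argn).

def constants_of(atoms):
    """Constants occurring in the node, in a fixed order."""
    return sorted({t for atom in atoms for t in atom[1:]})

def fresh_constant(used):
    """A constant distinct from all constants of the node."""
    i = 1
    while f"c{i}" in used:
        i += 1
    return f"c{i}"

def delta_star_choices(atoms):
    """Constants tried for an existential variable by the delta* rule:
    every constant of the node, then one new constant."""
    used = constants_of(atoms)
    return used + [fresh_constant(set(used))]

def substitute(atom, binding):
    return (atom[0],) + tuple(binding.get(t, t) for t in atom[1:])

def puhr_bindings(atoms, range_atoms, variables):
    """Tuples of node constants that satisfy a conjunctive range in the node,
    i.e. the instantiations for which the PUHR rule may fire."""
    found = []
    for values in product(constants_of(atoms), repeat=len(variables)):
        binding = dict(zip(variables, values))
        if all(substitute(a, binding) in atoms for a in range_atoms):
            found.append(binding)
    return found

node = {("p", "a")}
print(puhr_bindings(node, [("p", "X")], ["X"]))               # [{'X': 'a'}]
print([("q", "a", c) for c in delta_star_choices(node)])
```

The second line of output lists the instances of an existential formula over `q(a, y)` that the $\delta^*$ rule would place on separate branches: one per constant of the node and one with a new constant.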

Definition 8. (EP Tableau) If $S$ is a set of formulas, $Atoms(S)$ will denote
the set of ground atoms in $S$. EP tableaux for a set $S$ of PRQ formulas are
trees whose nodes are sets of closed formulas. They are inductively defined as
follows:

1. The tree consisting of the single node $S$ is an EP tableau for $S$.

2. Let $T$ be an EP tableau for $S$, $L$ a leaf of $T$, and $\varphi$ a formula
in $L$ that is not satisfied in the term interpretation
$\mathcal{T}(Atoms(L))$. Then the tree obtained from $T$ by appending one or
more children to $L$ according to the expansion rule applicable to $\varphi$ is
an EP tableau for $S$. Each child of $L$ consists of $L$ and one more formula,
or two more formulas in the case of $\varphi$ being a conjunction.

^1 Strictly, the syntax of Definition 1 would require
$\forall x\,(p(x) \to \exists y\,(p(y) \land \top))$.




A Deduction Method Complete for Refutation and Finite Satisfiability 






A branch of an EP tableau is closed if it contains $\bot$. Otherwise, it is
open. An EP tableau is open if at least one of its branches is open; otherwise,
it is closed. If $B$ is a branch in an EP tableau, then $\bigcup B$ denotes the
union of the nodes in $B$. An EP tableau is satisfiable if it has a branch $B$
such that $\bigcup B$ is satisfiable.

Note that the PUHR rule is the only expansion rule which can be applied
more than once to the same formula along a branch of an EP tableau. Indeed
the condition “that is not satisfied in the term interpretation
$\mathcal{T}(Atoms(L))$” prevents repeated applications of rules other than the
PUHR rule. Note also that, although only finite EP tableaux can be constructed
in finite time, Definition 8 does not preclude infinite EP tableaux.

Example 9. Let $S_1 = \{p(a), \forall x\,(p(x) \to r(x) \lor \exists y\, q(x, y))\}$.^2
The following table denotes an EP tableau for $S_1$ in the manner of [6] (to
which the denomination “tableaux method” goes back): successor nodes are to the
right of their parent nodes, branching is represented vertically, and the nodes
are not labeled with sets of formulas but with the single formula added at the
corresponding node.

$$\begin{array}{llll}
S_1 & r(a) \lor \exists y\, q(a, y) & r(a) & \\
 & & \exists y\, q(a, y) & q(a, a) \\
 & & & q(a, c_{new})
\end{array}$$

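Example 10. Let $S_2 = \{empl(c_0), \forall x\,(empl(x) \to \exists y\, works\text{-}for(x, y)), \forall x y\,(works\text{-}for(x, y) \to empl(x) \land empl(y))\}$.^3
The following denotes an infinite EP tableau for $S_2$ (the predicates are
abbreviated to their first letters):

$$\begin{array}{llllll}
S_2 & \exists y\, w(c_0, y) & w(c_0, c_0) & & & \\
 & & w(c_0, c_1) & e(c_1) & \exists y\, w(c_1, y) & w(c_1, c_0) \\
 & & & & & w(c_1, c_1) \\
 & & & & & w(c_1, c_2) \quad e(c_2) \quad \dots
\end{array}$$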
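
How a breadth-first expansion reaches the finite open branch of such a tableau can be sketched in a few lines. The code below is a simplified illustration, not the implementation of the authors: it only handles formulas of the shape "conjunction of atoms implies a disjunction of alternatives", each alternative being a conjunction of atoms with at most one existential variable, and it merges the PUHR, $\lor$, $\delta^*$ and $\land$ steps into a single expansion step. The encoding, the function names and the `max_nodes` budget are assumptions made here.

```python
from collections import deque
from itertools import product

# A rule is (universal_vars, body_atoms, alternatives); an alternative is
# (existential_vars, head_atoms); an empty list of alternatives stands for
# falsity. Variables are upper-case strings, atoms are tuples ("pred", args...).

def consts(atoms):
    return sorted({t for a in atoms for t in a[1:]})

def inst(atom, binding):
    return (atom[0],) + tuple(binding.get(t, t) for t in atom[1:])

def bindings(variables, domain):
    for values in product(domain, repeat=len(variables)):
        yield dict(zip(variables, values))

def alternative_holds(atoms, alt, b):
    evars, head = alt
    return any(all(inst(h, {**b, **e}) in atoms for h in head)
               for e in bindings(evars, consts(atoms)))

def first_violation(atoms, rules):
    """A rule instance whose range holds in the node but whose head does not."""
    for uvars, body, alts in rules:
        for b in bindings(uvars, consts(atoms)):
            if (all(inst(a, b) in atoms for a in body)
                    and not any(alternative_holds(atoms, alt, b) for alt in alts)):
                return alts, b
    return None

def successors(atoms, alts, b):
    used = consts(atoms)
    fresh = next(f"c{i}" for i in range(len(used) + 1) if f"c{i}" not in used)
    for evars, head in alts:
        assert len(evars) <= 1, "sketch: at most one existential variable"
        # delta* rule: every constant of the node, then one new constant
        choices = [{evars[0]: c} for c in used + [fresh]] if evars else [{}]
        for e in choices:
            yield atoms | {inst(h, {**b, **e}) for h in head}

def find_finite_model(facts, rules, max_nodes=1000):
    queue = deque([frozenset(facts)])        # FIFO queue: breadth-first
    for _ in range(max_nodes):
        if not queue:
            return None                      # every branch is closed
        node = queue.popleft()
        hit = first_violation(node, rules)
        if hit is None:
            return node                      # nothing left to expand: a model
        queue.extend(successors(node, *hit))
    raise RuntimeError("node budget exhausted")

S2_facts = {("e", "c0")}
S2_rules = [
    (["X"], [("e", "X")], [(["Y"], [("w", "X", "Y")])]),
    (["X", "Y"], [("w", "X", "Y")], [([], [("e", "X"), ("e", "Y")])]),
]
print(sorted(find_finite_model(S2_facts, S2_rules)))
# [('e', 'c0'), ('w', 'c0', 'c0')]
```

On $S_2$ the search stops at the branch ending in $w(c_0, c_0)$, although the tableau as a whole is infinite.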

6 Refutation Soundness and Completeness 

The results of this section are established using standard techniques (cf. e.g.
[16,11,18]). In the following, $S$ denotes a set of PRQ formulas.

Lemma 11. The application of an expansion rule to a satisfiable EP tableau
results in a satisfiable EP tableau.

Proof. (sketched) For every expansion rule, one easily shows that if a node $N$
of an EP tableau is satisfiable, then there is at least one successor of $N$
which is satisfiable. ■



Theorem 12. (Refutation Soundness) If there exists a closed EP tableau
for $S$, then $S$ is unsatisfiable.

Proof. Assume $S$ is satisfiable. By Lemma 11 there exists no closed EP tableau
for $S$. ■

^2 Strictly, the syntax of Definition 1 would require
$\forall x\,(p(x) \to r(x) \lor \exists y\,(q(x, y) \land \top))$.
^3 Definition 1 would require
$\forall x\,(empl(x) \to \exists y\,(works\text{-}for(x, y) \land \top))$.

The following definition formalises the standard concept of fairness, cf. e.g.
[16]. Recall that the nodes of an EP tableau are sets of PRQ formulas.

Definition 13.

• Let $T$ be an EP tableau for $S$, and $B$ a branch in $T$. Then $\bigcup B$
is said to be saturated if the following holds:

1. If $E_1 \lor E_2 \in \bigcup B$ then $E_1 \in \bigcup B$ or
$E_2 \in \bigcup B$.

2. If $E_1 \land E_2 \in \bigcup B$ then $E_1 \in \bigcup B$ and
$E_2 \in \bigcup B$.

3. If $\exists x\, E(x) \in \bigcup B$ then $E[c_1/x] \in \bigcup B$, or ...,
or $E[c_n/x] \in \bigcup B$, or $E[c_{new}/x] \in \bigcup B$. Here
$c_1, \dots, c_n$ are all constants occurring in the node that is expanded by
the $\exists$ rule, and $c_{new}$ is a constant not occurring in this node.

4. If $\forall \bar{x}\,(R(\bar{x}) \to F) \in \bigcup B$, then for all
substitutions $\sigma$ such that
$\mathcal{T}(Atoms(\bigcup B)) \models R\sigma$, $F\sigma \in \bigcup B$.

• An EP tableau $T$ is called fair if $\bigcup B$ is saturated for each open
branch $B$ of $T$.



Lemma 14. (Model Soundness) Let $T$ be an EP tableau for $S$ and $B$ an
open branch of $T$. If $T$ is fair, then
$\mathcal{T}(Atoms(\bigcup B)) \models S$.

Proof. (sketched) By induction on the structure of PRQ formulas. ■

Theorem 15. (Refutation Completeness) If $S$ is unsatisfiable, then every
fair EP tableau for $S$ is closed.

Proof. Immediate consequence of Lemma 14. ■

Corollary 16. If $S$ is not finitely satisfiable, then every open branch of a
fair EP tableau for $S$ is infinite.

Proof. Assume that $S$ is not finitely satisfiable. Assume there is a fair EP
tableau $T$ with a finite open branch $B$. By Lemma 14,
$\mathcal{T}(Atoms(\bigcup B))$ is a finite model of $S$, a contradiction. ■

7 Finite Satisfiability Completeness 

The proof of the completeness for finite satisfiability of EP tableaux is more 
sophisticated than that of other theorems given in the previous sections. It makes 
use of nonstandard notions, that are first introduced. 

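Definition 17. (Simple Expansion) Let $S$ be a satisfiable set of PRQ
formulas, $\varphi$ an element of $S$, and $\mathcal{T}(\mathcal{G})$ a term
model of $S$. Simple expansions $S'$ of $S$ with respect to $\varphi$ and
$\mathcal{T}(\mathcal{G})$ are defined as follows:

1. If $\varphi$ is a ground atom, then $S' := S$.

2. If $\varphi = \varphi_1 \land \varphi_2$, then
$S' := (S \setminus \{\varphi\}) \cup \{\varphi_1, \varphi_2\}$.

3. If $\varphi = \varphi_1 \lor \varphi_2$, then
$S' := (S \setminus \{\varphi\}) \cup \{\varphi_i\}$ for one $i \in \{1, 2\}$
such that $\mathcal{T}(\mathcal{G}) \models \varphi_i$.^4

4. If $\varphi = \exists x\, \varphi_1$, then
$S' := (S \setminus \{\varphi\}) \cup \{\varphi_1[c/x]\}$, where $c$ is a
constant such that $\mathcal{T}(\mathcal{G}) \models \varphi_1[c/x]$.^5

5. If $\varphi = \forall \bar{x}\,(R(\bar{x}) \to F)$, then
$S' := (S \setminus \{\varphi\}) \cup \{F[\bar{c}/\bar{x}] \mid \bar{c}$ a
tuple of constants s.t.
$\mathcal{T}(\mathcal{G}) \models R[\bar{c}/\bar{x}]\}$.^6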

Note that for every $S$, every element $\varphi$ of $S$, and every model
$\mathcal{T}(\mathcal{G})$ of $S$, there exists at least one simple expansion
of $S$ w.r.t. $\varphi$ and $\mathcal{T}(\mathcal{G})$. A simple expansion $S'$
of $S$ w.r.t. $\varphi$ and $\mathcal{T}(\mathcal{G})$ differs from $S$
whenever $\varphi$ is nonatomic. The existence of a simple expansion $S'$ of
$S$ w.r.t. $\varphi$ and $\mathcal{T}(\mathcal{G})$ such that $S \neq S'$ does
not necessarily mean that some EP tableaux expansion rule can be applied to
$\varphi$. Indeed, according to Definition 8 an expansion rule can only be
applied if $\mathcal{T}(Atoms(S)) \not\models \varphi$. Every simple expansion
of a finite set $S$ of PRQ formulas w.r.t. a formula and a finite model
$\mathcal{T}(\mathcal{G})$ of $S$ is finite. Because of item 5 in Definition 17
this is not necessarily the case if $\mathcal{T}(\mathcal{G})$ is infinite.

Lemma 18. Let $S$ be a set of PRQ formulas, $\varphi \in S$,
$\mathcal{T}(\mathcal{E})$ a finite, minimal term model of $S$, and $S'$ a
simple expansion of $S$ w.r.t. $\varphi$ and $\mathcal{T}(\mathcal{E})$. Then
$\mathcal{T}(\mathcal{E})$ is a minimal model of $S'$.

Proof. (sketched) By a case analysis based on the structure of $\varphi$. ■



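Definition 19. (Rank)

• Let $\varphi$ be a (not necessarily closed) PRQ formula and $d$ a positive
integer. The $d$-rank $rk(\varphi, d)$ of a PRQ formula is inductively defined
as follows:

1. If $\varphi$ is an atom, then $rk(\varphi, d) := 0$.

2. If $\varphi = \varphi_1 \land \varphi_2$, or
$\varphi = \varphi_1 \lor \varphi_2$, or $\varphi = \varphi_1 \to \varphi_2$,
then $rk(\varphi, d) := rk(\varphi_1, d) + rk(\varphi_2, d) + 1$.

3. If $\varphi = \exists x\, \psi$, then $rk(\varphi, d) := rk(\psi, d) + 1$.

4. If $\varphi = \forall \bar{x}\, \psi$, then
$rk(\varphi, d) := rk(\psi, d) \times d^n$, where $n$ is the size of the tuple
$\bar{x}$.

• Let $S$ be a set of PRQ formulas, $\mathcal{T}(\mathcal{E})$ a finite minimal
model of $S$, and $d$ the cardinality of the domain of
$\mathcal{T}(\mathcal{E})$. The rank $rk(S, \mathcal{T}(\mathcal{E}))$ of $S$
with respect to $\mathcal{T}(\mathcal{E})$ is defined by
$rk(S, \mathcal{T}(\mathcal{E})) := \sum_{\varphi \in S} rk(\varphi, d)$ if
$Atoms(S) \subsetneq \mathcal{E}$ and $rk(S, \mathcal{T}(\mathcal{E})) := 0$ if
$Atoms(S) = \mathcal{E}$.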
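
The inductive part of Definition 19 translates directly into a recursive function. The sketch below covers only the $d$-rank of a single formula; the tuple encoding and the name `rank` are assumptions made for this illustration.

```python
# Assumed encoding: ("atom", name), ("and", A, B), ("or", A, B),
# ("implies", A, B), ("exists", x, A), ("forall", (x1, ..., xn), A).

def rank(phi, d):
    """d-rank of a PRQ formula, following items 1 to 4 of the definition."""
    kind = phi[0]
    if kind == "atom":
        return 0
    if kind in ("and", "or", "implies"):
        return rank(phi[1], d) + rank(phi[2], d) + 1
    if kind == "exists":
        return rank(phi[2], d) + 1
    if kind == "forall":
        n = len(phi[1])                     # size of the quantified tuple
        return rank(phi[2], d) * d ** n
    raise ValueError(f"unknown connective: {kind!r}")

# forall x (p(x) -> exists y q(x, y)) over a domain with 3 elements
phi = ("forall", ("x",),
       ("implies", ("atom", "p(x)"), ("exists", "y", ("atom", "q(x,y)"))))
print(rank(phi, 3))   # (0 + (0 + 1) + 1) * 3**1 = 6
```

The factor $d^n$ accounts for the number of ground instances that a universally quantified formula can contribute over a domain of $d$ elements.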

Note that $rk(S, \mathcal{T}(\mathcal{E})) = 0$ if and only if
$\mathcal{T}(Atoms(S))$ is a model of $S$. In other words,
$rk(S, \mathcal{T}(\mathcal{E})) > 0$ if and only if some EP tableaux expansion
rule can be applied to some formula in $S$.

^4 Since $\mathcal{T}(\mathcal{G}) \models S$ and $\varphi \in S$,
$\mathcal{T}(\mathcal{G}) \models \varphi_i$ for at least one of $i = 1, 2$.

^5 Such a constant necessarily exists since
$\mathcal{T}(\mathcal{G}) \models S$ and $\varphi \in S$.

^6 If there are no constants $\bar{c}$ such that
$\mathcal{T}(\mathcal{G}) \models R[\bar{c}/\bar{x}]$, then
$S' = S \setminus \{\varphi\}$.

Lemma 20. Let $S$ be a finitely satisfiable set of PRQ formulas,
$\varphi \in S$, $\varphi$ nonatomic, $\mathcal{T}(\mathcal{E})$ a finite
minimal model of $S$, and $S'$ a simple expansion of $S$ w.r.t. $\varphi$ and
$\mathcal{T}(\mathcal{E})$. If $rk(S, \mathcal{T}(\mathcal{E})) \neq 0$, then
$rk(S', \mathcal{T}(\mathcal{E})) < rk(S, \mathcal{T}(\mathcal{E}))$.

Proof. (sketched) By a case analysis based on the structure of $\varphi$. ■



Theorem 21. (Completeness for Finite Satisfiability) Let
$\mathcal{T}(\mathcal{E})$ be a finite term model of $S$. If
$\mathcal{T}(\mathcal{E})$ is a minimal model of $S$, then every fair EP
tableau for $S$ has a finite, open branch $B$ such that, up to a renaming of
constants, $Atoms(\bigcup B) = \mathcal{E}$.

The proof is based on a double induction. This is needed since the PUHR
rule can repeatedly be applied to the same formula along the same branch. As a
consequence, a measure of the syntactical complexity of the set of formulas,
which would be a natural induction parameter, does not decrease after an
application of the PUHR rule. This is overcome by a second induction on the
number of applications of the PUHR rule to the same formula.

Proof. Let $\mathcal{T}(\mathcal{E})$ be a finite, minimal term model of $S$.
The proof is by induction on $rk(S, \mathcal{T}(\mathcal{E}))$. Induction
hypothesis:

(★) If $\mathcal{M}$ is a set of PRQ formulas, if $\mathcal{T}(\mathcal{F})$
is a finite, minimal term model of $\mathcal{M}$, and if
$rk(\mathcal{M}, \mathcal{T}(\mathcal{F})) < n$, then every fair EP tableau
for $\mathcal{M}$ has a finite, open branch $B$ s.t. up to a renaming of
constants $Atoms(\bigcup B) = \mathcal{F}$.

Assume that $rk(S, \mathcal{T}(\mathcal{E})) = 0$. $S$ has therefore a single
minimal term model, namely $\mathcal{T}(Atoms(S))$, and every fair EP tableau
for $S$ consists of one single node equal to $S$. Clearly, the result holds.

Assume that $rk(S, \mathcal{T}(\mathcal{E})) = n > 0$. Let $T$ be a fair EP
tableau for $S$. Since $S$ is satisfiable, by Theorem 12 $T$ is open. Since
$rk(S, \mathcal{T}(\mathcal{E})) > 0$ there exists at least one formula
$\varphi \in S$ on which an expansion rule can be applied. Since $T$ is fair,
its root necessarily has successor(s). Let $\varphi \in S$ be the formula on
which the application of an expansion rule yields the successor(s) of the root
of $T$.

Case 1: $\varphi = \varphi_1 \land \varphi_2$, or
$\varphi = \varphi_1 \lor \varphi_2$, or $\varphi = \exists x\, \psi$. By
Definition 17 there is at least one successor $N$ of the root such that
$N = \{\varphi\} \cup S'$ where, up to constant renaming in case
$\varphi = \exists x\, \psi$, $S'$ is a simple expansion of $S$ w.r.t.
$\varphi$ and $\mathcal{T}(\mathcal{E})$. Since an EP tableau expansion rule
cannot be applied more than once to a formula like $\varphi$, the tableau
rooted at $N$ is an EP tableau $T'$ for the simple expansion $S'$. $T'$ is fair
because so is $T$. For every simple expansion $S'$ of $S$ w.r.t. $\varphi$ and
$\mathcal{T}(\mathcal{E})$, by Lemma 18, $\mathcal{T}(\mathcal{E})$ is a
minimal model of $S'$. Since $rk(S, \mathcal{T}(\mathcal{E})) > 0$ and
$\varphi$ is nonatomic, by Lemma 20
$rk(S', \mathcal{T}(\mathcal{E})) < rk(S, \mathcal{T}(\mathcal{E}))$.
Therefore, by induction hypothesis (★), the tableau rooted at $N$ has a finite
open branch $B'$ such that, up to a renaming of constants,
$Atoms(\bigcup B') = \mathcal{E}$. Hence, the same holds of $T$.
Case 2: $\varphi = \forall \bar{x}\,(R(\bar{x}) \to F)$. Let $S'$ be the
(unique) simple expansion of $S$ w.r.t. $\varphi$ and
$\mathcal{T}(\mathcal{E})$. Along a branch of the fair EP tableau $T$ for $S$,
the PUHR rule is possibly applied more than once to $\varphi$. Therefore, the
tree rooted at the successor $N$ of the root of $T$ is not necessarily an EP
tableau for $S'$. In the following it is shown how parts of $T$ can be regarded
as parts of an EP tableau for $S'$. For $n \in \mathbb{N}$ and a branch $B$ of
$T$, let $B^n$ denote the prefix of $B$ up to (and without) the $(n+1)$-th
application of the PUHR rule on $\varphi$, if the PUHR rule is applied more
than $n$ times to $\varphi$ in $B$; otherwise, let $B^n := B$. The following is
first established by induction on $n$: For all $n \in \mathbb{N} \setminus \{0\}$,

(★★) $T$ has a branch $B$ s.t. up to a renaming of constants
$Atoms(B^n) \subseteq \mathcal{E}$.

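Case 2.1: $n = 1$: The successor $N$ of the root of $T$ results from an
application of the PUHR rule to $\varphi = \forall \bar{x}\,(R(\bar{x}) \to F)$,
i.e., by Definitions 7 and 8, there is a set $\mathcal{G}$ of ground atoms and
a substitution $\sigma$ such that $\mathcal{G} \subseteq S$,
$\mathcal{T}(\mathcal{G}) \models R\sigma$,
$\mathcal{T}(\mathcal{G}) \not\models F\sigma$, and $N = S \cup \{F\sigma\}$.
Since by hypothesis $\mathcal{T}(\mathcal{E})$ is a model of $S$,
$\mathcal{G} \subseteq \mathcal{E}$ and by Lemma 3
$\mathcal{T}(\mathcal{E}) \models R\sigma$. Furthermore, since
$\mathcal{T}(\mathcal{E}) \models \varphi$,
$\mathcal{T}(\mathcal{E}) \models F\sigma$. Since $S'$ is by hypothesis the
(unique) simple expansion of $S$ w.r.t. $\varphi$ and
$\mathcal{T}(\mathcal{E})$, $F\sigma \in S'$. So, there is an EP tableau $T'$
for $S'$, which coincides with $T$ from $N$ until the second application of the
PUHR rule on $\varphi$ in all branches. Since by Lemma 18
$\mathcal{T}(\mathcal{E})$ is a minimal model of $S'$ and since by Lemma 20
$rk(S', \mathcal{T}(\mathcal{E})) < rk(S, \mathcal{T}(\mathcal{E}))$, the
induction hypothesis (★) is applicable: There is a branch $B'$ in $T'$ with, up
to constant renaming, $Atoms(\bigcup B') = \mathcal{E}$. So, for the
corresponding branch $B$ in $T$, $Atoms(B^1) \subseteq \mathcal{E}$.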

Case 2.2: $n > 1$: Assume that (★★) holds for all $m < n$. Let
$B_1, \dots, B_k$ be all such branches of $T$.

If for some $i = 1, \dots, k$, $B_i^n = B_i^{n+1} = B_i$, i.e. the PUHR rule is
applied at most $n$ times to $\varphi$ along $B_i$, then by induction
hypothesis (★★) $Atoms(B_i^{n+1}) \subseteq \mathcal{E}$. Otherwise, since by
induction hypothesis (★★) $Atoms(B_i^n) \subseteq \mathcal{E}$, by Lemma 3 and
by definition of $S'$ each formula $F\sigma_i$ resulting from an $(n+1)$-th
application of the PUHR rule to $\varphi$ in the branch $B_i$ is in $S'$.
Therefore, an EP tableau $T'$ for $S'$ can be constructed from the subtree of
$T$ rooted at $N$ as follows: First, replace $N$ by $S'$. Second, keep from
each branch $B_i$ only the prefix $B_i^{n+1}$. Third, remove from each
$B_i^{n+1}$ those nodes resulting from applications of the PUHR rule to
$\varphi$. Fourth, cut all other branches immediately before the first
application of the PUHR rule to $\varphi$. $T'$ is a finite EP tableau for
$S'$, which is not necessarily fair. Since $T'$ is finite, a fair EP tableau
$T''$ for $S'$ can be obtained by further expanding $T'$. By Lemma 20,
$rk(S', \mathcal{T}(\mathcal{E})) < rk(S, \mathcal{T}(\mathcal{E}))$. By
induction hypothesis (★), $T''$ has a branch $B'$ with, up to constant
renaming, $Atoms(\bigcup B') = \mathcal{E}$. By definition of
$B_1, \dots, B_k$ and $T''$ there is a branch $B_i$ in $T$ such that
$Atoms(B_i^{n+1}) = Atoms(B'^{\,n+1})$. Hence,
$Atoms(B_i^{n+1}) \subseteq \mathcal{E}$.

Since by hypothesis $\mathcal{T}(\mathcal{E})$ is finite, $T$ has a finite
branch $B$ for which (★★) holds. Hence, this branch is open. Since $T$ is fair,
by Lemma 14 $\mathcal{T}(Atoms(\bigcup B)) \models S$ and since
$\mathcal{T}(\mathcal{E})$ is minimal, up to a renaming of constants,
$Atoms(\bigcup B) = \mathcal{E}$. ■

It follows from Theorem 21 that a breadth-first expansion of EP tableaux is
complete for finite satisfiability. A depth-first expansion of EP tableaux is
not complete for finite satisfiability, as Example 22 shows. However, by
Theorem 15, a depth-first expansion of EP tableaux is complete for
unsatisfiability.




François Bry and Sunna Torge



Example 22. Consider the following PRQ formulas: $F_1 = s(a, b)$,
$F_2 = \forall x y\,(s(x, y) \to \exists z\, s(y, z))$,
$F_3 = \forall x y z\,(s(x, y) \land s(y, z) \to s(x, z))$,
$F_4 = \forall x y\,(s(x, y) \land s(y, x) \to \bot)$. Let
$S_3 = \{F_1, F_2, F_3, F_4\}$. The models of $S_3$ are infinite, as the
following EP tableau shows:

$$\begin{array}{lllllll}
S_3 & \exists z\, s(b, z) & s(b, a) & \bot & & & \\
 & & s(b, b) & \bot & & & \\
 & & s(b, c_1) & s(a, c_1) & \exists z\, s(c_1, z) & s(c_1, a) & \bot \\
 & & & & & s(c_1, b) & \bot \\
 & & & & & s(c_1, c_1) & \bot \\
 & & & & & s(c_1, c_2) & \exists z\, s(c_2, z)
\end{array}$$

Consider now $G = (F_1 \land F_2 \land F_3 \land F_4) \lor p$ and
$S_4 = \{G\}$. A depth-first, leftmost expansion of an EP tableau for $S_4$
first starts the construction of an EP tableau for the subformula
$(F_1 \land F_2 \land F_3 \land F_4)$, i.e. of an infinite EP tableau similar
to an EP tableau for $S_3$, and thus never expands the finite EP tableau for
the subformula $p$.

8 Implementation 

A concise Prolog program, called FINFIMO (FINd all Finite MOdels), in the
style of [25] implements a depth-first expansion of EP tableaux. For space
reasons, this implementation is not discussed here. It is given in [9] and
available at:
http://www.pms.informatik.uni-muenchen.de/software/finfimo/ . First
experiments, as well as the database application presented in [8,9], which is
based on an implementation of a breadth-first expansion of EP tableaux [8,9],
point to an efficiency that is fully acceptable for the applications mentioned
in Section 2.

9 Related Work 

EP tableaux are related to approaches of different kinds: (1) generators of
models of (or up to) a given cardinality, (2) coupling of generators of models
of (or up to) a given cardinality with a refutation prover, (3) tableaux
methods that rely on the extended $\delta$ or $\delta^*$ rule first introduced
in [10], (4) tableaux methods that make use of the $\gamma$ rule instead of a
PUHR rule, and (5) generators of finitely representable models.

(1) Possibly one of the first generators of models of a given cardinality was
described in [21]. Nowadays, among the best-known generators of finite models
of (or up to) a given cardinality are FINDER [31] and SEM [37]. Their strength
lies in a sophisticated, very efficient implementation of the exhaustive search for
models up to a given cardinality. Most generators of finite models up to a given
cardinality can continue the search with a higher cardinality if no models of the
formerly given cardinality can be found. However, they require an upper bound
on the cardinality of the models sought. For the applications mentioned
in Section 2, e.g. the application described in [8], this might be too strong a
requirement. Moreover, model generators for given cardinalities such as FINDER
and SEM cannot detect unsatisfiability.
(2) Model generators for given cardinalities such as FINDER [31] and SEM [37]
can be coupled with a refutation prover as described e.g. in [30], resulting in a 
system complete for both finite satisfiability and unsatisfiability. Arguably, this 
approach is less convenient for many applications than the approach described 
here. In particular, for natural language understanding [14] and for an interactive 
system as described in [8,9] a single deduction system as proposed in the present 
paper seems preferable. 

(3) The extended $\delta$ (or $\delta^*$) rule used in EP tableaux has been proposed in former
studies. It has been mentioned several times, to the best of our knowledge first
in [10,20,23]. It is by no means always the case that this rule results in a loss of
efficiency for refutation reasoning, for it sometimes gives rise to earlier detection
of (finite) satisfiability. This is e.g. the case with the formula
$p(a) \land \forall x\, (p(x) \to \exists y\, p(y))$. Although it has a finite model with only one
satisfied $p$ fact, most refutation provers as well as tableaux methods using the
standard $\delta$ rule expand an infinite model. Such an example is by no means unlikely
in applications. An “unprovability proof system” is proposed in [35] which is
complete for finite falsifiability.¹ Although it relies on two rules that are
reminiscent of the extended $\delta$ or $\delta^*$ rule, it is not complete for both
unsatisfiability and finite falsifiability. It is suggested in [35] to couple it
with a refutation method to achieve completeness for both properties.
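
The difference between the two rules on the formula $p(a) \land \forall x\, (p(x) \to \exists y\, p(y))$
can be sketched as follows. This Python fragment is a deliberately simplified
illustration and not an implementation of EP tableaux: it handles only this one
formula, the function name `expand` and the step bound are assumptions, and the
extended rule is reduced to its first alternative, namely reusing a constant that
already occurs on the branch.

```python
def expand(extended, max_steps=50):
    """Expand p(a) and forall x (p(x) -> exists y p(y)) on a single branch.

    Returns (constants c with p(c), whether the branch is saturated).
    """
    p = ["a"]          # constants c such that p(c) is on the branch
    pending = ["a"]    # instances whose conclusion 'exists y p(y)' is unprocessed
    fresh = 0
    for _ in range(max_steps):
        if not pending:
            return p, True
        pending.pop(0)
        if extended:
            # extended delta rule (simplified): a known constant already
            # witnesses 'exists y p(y)', so nothing has to be added
            continue
        # standard delta rule: always introduce a fresh constant
        fresh += 1
        witness = f"c{fresh}"
        p.append(witness)
        pending.append(witness)
    return p, not pending
```

Under these assumptions, the extended rule stops with the single fact $p(a)$, while
the standard rule keeps adding new constants until the step bound is hit.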

(4) The approach described in the present paper differs from most tableaux
methods in the use of the PUHR (positive unit hyperresolution) rule, whose
introduction in a tableaux method was first proposed in [25]. Formalizations of this
approach [25] have been given in e.g. [11,3]. The PUHR rule avoids the “blind
instantiation” of the $\gamma$ rule in those frequent cases where the D predicate
of Theorem 4 is not needed. In such cases, the gain in efficiency compared with
tableaux methods relying on the $\gamma$ rule can be considerable [25]. In the
implementation described in [23] the blind instantiation of the $\gamma$ rule is controlled by
giving a limit on the number of $\gamma$ expansions for each $\gamma$ formula. In practice,
conveniently setting such upper bounds might be difficult. A further advantage of
the approach presented here is its short and easily adaptable implementation
given in [9].

(5) Other extensions and refinements of tableaux methods generate finite
representations of (possibly infinite) models [12,33,15,27]. In [27] a method for extracting
models of (possibly infinite) branches by means of equational constraints is
described. The approaches [33,15] are based on resolution and therefore are much
more efficient than approaches based on the $\delta$ rule of classical tableaux methods.
In contrast to the method described in the present paper, the method described
in [33] only applies to the monadic and Ackermann classes. The method of [15]
which, like the PUHR [25,11] and EP tableaux, is based on positive hyperresolution,
avoids splitting. In some cases, this results in gains in efficiency. This also
makes the method capable of building (finite representations of) infinite models
for formulas that are not finitely satisfiable. The goal of EP tableaux being
completeness for both refutation and finite satisfiability, the capability of the
method of [15] to build finite representations of infinite models cannot be taken
as a comparison criterion: adding it to the EP tableaux method would make it
incorrect with respect to finite satisfiability. Recall that for the applications
outlined in Section 2, which motivated the present paper, finite models are
needed and finitely representable models are not convenient.

¹ Rather inappropriately called “finite unprovability” in [35].

It is difficult to compare the EP tableaux method with refutation methods 
for clausal axiom systems, for it does not make much sense to check Skolem- 
ized axiom systems for finite satisfiability. Except in trivial cases, Skolemization 
expands finite Herbrand as well as term models into infinite ones, indeed. 

10 Conclusion and Perspectives 

Some applications of theorem proving have been discussed that can benefit from
a model generator that is complete for both refutation and finite satisfiability
and that, furthermore, does not impose an upper bound on the size of the models
searched for. The Extended Positive (EP) Tableaux approach, developed for such
applications, has been presented. Like the PUHR tableaux [25,11] they extend, EP
tableaux rely on positive unit hyperresolution and “range restriction” for avoiding
the “blind instantiation” performed by the $\gamma$ rule of standard tableaux [16,34,18].
Instead of relying on Skolemization, as most refutation methods do, Extended
Positive Tableaux use the extended $\delta$ (or $\delta^*$) rule of [10,20,23]. It was shown
that this rule makes the Extended Positive Tableaux method complete not only
for refutation, like standard tableaux methods, but also for finite satisfiability.
A prototype written in Prolog, given in [9], in the style of SATCHMO [25]
implements the EP tableaux method with a depth-first strategy. The system described
in [8,9] is based on a breadth-first expansion of EP tableaux. In these papers,
EP tableaux are not formally introduced, nor are soundness and completeness
properties established.

The following issues deserve further investigations. 

(1) An analysis of the efficiency of EP tableaux and of FINFIMO is needed. This issue
is however a difficult one because there are not many systems that are complete for
both finite satisfiability and unsatisfiability, and no benchmarks are available
that are fully relevant for finite satisfiability verification. Note that the extended
$\delta$ (or $\delta^*$) rule sometimes cuts down infinite search spaces and thus sometimes
results in a gain in efficiency for refutation reasoning. In other cases, however, it
expands a larger search space than the standard $\delta$ rule [32,16]. In some cases, the
EP tableaux method expands isomorphic interpretations. It would be interesting
to investigate whether techniques for avoiding this, such as the least number
heuristic mentioned in [37], can be integrated in the EP tableaux method.

(2) For most applications it would be desirable to have typed variables. An
extension based on a simple type system and many-sorted logic seems sufficient for
applications such as the one described in [8].

(3) The method could be extended to languages with function symbols. Due to
the existential quantifiers it is possible with EP formulas to express functions by
relations. No theoretical extensions would be necessary for this, and
constraint reasoning techniques as in [1] seem applicable. Nevertheless, explicit
function symbols might be more convenient for some applications. On the other
hand, there are applications, e.g. the one described in [8], that do not need
function symbols at all.

(4) For applications such as diagnosis [4], natural language understanding [14], and
the database issues mentioned in Section 2 [7,2,8,9], it would be preferable to
have a method that is not only complete for finite satisfiability but also generates
only minimal models, as investigated e.g. in [11] or in [26].
Abstract. We examine an approach for demand-driven cooperative theorem
proving that is well-suited for saturation-based theorem provers.
We briefly point out some problems arising from the use of common
success-driven cooperation methods, and we propose the application of
our approach of requirement-based cooperative theorem proving. This
approach aims at orienting cooperation more closely on the current needs
of provers than conventional cooperation concepts do. We introduce an
abstract framework for requirement-based cooperation and describe two
instantiations of it: requirement-based exchange of facts and sub-problem
division and transfer via requests. Finally, we report on an experimental
study conducted in the areas of superposition and unfailing completion.



1 Introduction 

Automated deduction is a search problem that spans huge search spaces. In 
the past, many different calculi have hence been developed in order to cope 
with problems from automated theorem proving, e.g. the superposition calculus 
([BG94]) or certain kinds of tableau calculi. Furthermore, the general undecid- 
ability of problems connected with (automated) deduction entails an indeter- 
minism in the calculi that has to and can only be tackled with heuristics. Hence, 
usually a large number of calculi, each of them controllable via various heuristics, 
can be employed when tackling certain problems of theorem proving. 

When studying the results of certain theorem proving competitions (e.g., [SS97]),
it becomes apparent that each calculus or heuristic has its specific strengths and
weaknesses. As a matter of fact, for most domains there is no single
strategy capable of proving all problems of the domain in an acceptable amount
of time. Therefore, a topic that has recently come into the focus of research is
the use of different strategies in parallel (see, e.g., [Ert92]).

A better approach, however, is to employ cooperative theorem provers. The
aim of cooperative theorem proving is to let several provers work in parallel and
to exchange information between them. Synergetic effects that are likely to
occur should then yield a further gain in efficiency. Some architectures have been
proposed for cooperative proving, e.g. in [Sut92,Den95,BH95,Bon96,FD97,Fuc98a,DF98].

Existing cooperation approaches are for the most part success-driven: One prover
detects a certain piece of information, e.g. a derived fact, that has been useful for it. Then,
this information is transferred to the receiver and integrated into its search state.
One main problem regarding this cooperation technique is the lack of orientation
on concrete needs or wishes of receiving provers. Hence, useless information
is often exchanged.

Therefore, the aim of this paper is to introduce a cooperation model for
saturation-based provers that orients itself on concrete needs of theorem provers.
The main idea of our approach of requirement-based theorem proving is to
send information only as a response to a request of the receiving prover that asks for
certain kinds of information. Thus, we want to focus on some kind of
demand-driven cooperation. We utilize requests so as to concentrate on needs of provers
in two ways: Firstly, we point out a method for a requirement-based exchange
of facts. Secondly, we will deal with methods to realize problem division and
transfer via requests. As we will see, we introduce with the latter an analytic
component into provers that do not necessarily work analytically by themselves.

In the following, we introduce basics of automated deduction — in particular 
saturation-based theorem proving — in section 2. In section 3, we introduce a 
framework for requirement-based cooperative theorem proving and describe the 
behavior of our cooperative system. Sections 4 and 5 address concrete aspects of 
requirements, namely sub-problem transfer via requirements and requirement- 
based exchange of facts, respectively. After that, we underline the strength of 
our approach by first empirical studies in section 6. A discussion concludes the 
paper. 

2 Basics of Automated Deduction 

In general, automated theorem proving deals with the following problem: Given a
set of facts $Ax$ (axioms), is a further fact $\lambda_G$ (goal) a logical consequence of
the axioms? A fact may be a clause, equation, or a general first or higher-order
formula.

Provers based on saturation-based calculi continuously produce
logical consequences of $Ax$ until a fact covering the goal appears (although
some saturation-based calculi also use the goal in inferences). Typically, a
saturation-based calculus contains several inference rules of an inference system
$\mathcal{I}$ which can be applied to a set of facts (which represents a certain search
state). Expansion inference rules generate new facts from known ones and add these
facts to the search state. Contraction inference rules delete facts or replace facts
by others.

Usually, a theorem prover using a saturation-based calculus maintains a
set $F^P$ of so-called potential or passive facts from which it selects and removes
one fact $\lambda$ at a time. After the application of some contraction inference rules to
$\lambda$, it is put into the set $F^A$ of activated facts, or discarded if it was deleted by a
contraction rule. Activated facts are, unlike potential facts, allowed to produce
new facts via the application of expanding inference rules. The inferred new facts
are put into $F^P$. Initially, $F^A = \emptyset$ and $F^P = Ax$. The indeterministic selection
or activation step is realized by heuristic means. To this end, a heuristic $\mathcal{H}$
associates a natural number $\mathcal{H}(\lambda) \in \mathbb{N}$ with each $\lambda \in F^P$, and the $\lambda \in F^P$ with
the smallest $\mathcal{H}(\lambda)$ is selected.
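
The selection and activation loop described above can be sketched in a few lines
of Python. This is a generic illustration under simplifying assumptions and not the
loop of SPASS or Discount: facts are plain integers, the only "contraction" is the
removal of duplicates, and the names `saturate` and `add_up_to_20` as well as the
step bound are hypothetical.

```python
import heapq

def saturate(axioms, goal, infer, H, max_steps=1000):
    """Toy saturation loop over a passive set F^P and an active set F^A."""
    passive = [(H(f), f) for f in axioms]   # F^P, initially the axioms
    heapq.heapify(passive)
    active = []                             # F^A, initially empty
    for _ in range(max_steps):
        if not passive:
            return False, active            # saturated without reaching the goal
        _, fact = heapq.heappop(passive)    # select the fact with minimal H value
        if fact in active:
            continue                        # stand-in for contraction: drop duplicates
        active.append(fact)                 # activate the selected fact
        if fact == goal:
            return True, active
        for other in active:                # expansion with all activated facts
            for new_fact in infer(fact, other):
                heapq.heappush(passive, (H(new_fact), new_fact))
    return False, active

def add_up_to_20(a, b):
    """Toy expansion rule: from a and b infer a + b, bounded to keep the space finite."""
    return [a + b] if a + b <= 20 else []
```

For example, `saturate([3, 5], 11, add_up_to_20, lambda f: f)` activates facts in
increasing order until 11 is selected, whereas the goal 7 is never derived and the
loop stops once the passive set is empty.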

We conducted experimental studies with the prover SPASS ([WGR96]) in 
the area of first-order logic with equality. SPASS is based on the superposi- 
tion calculus (see [BG94]). The expansion rules of SPASS contain the common 
rules of the superposition calculus, i.e. superposition left and right, equality res- 
olution, and equality factoring. The contraction rules contain well-known rules 
like subsumption and rewriting. Moreover, we conducted experiments with the 
equational prover Discount ([ADF95]) which is based on unfailing completion 
(see [HR87,BDP89]). In this context the axioms are always universally quanti- 
fied equations, the proof goal is an arbitrarily quantified equation. The inference 
system underlying unfailing completion is in main parts a restricted version of 
the superposition calculus. It contains one expansion inference rule that corre- 
sponds to the superposition rule. The contraction rules of unfailing completion 
are rewriting, subsumption, and tautology deletion. 

3 A Framework for Requirement-Based Cooperation 

In the following, we will discuss which kinds of requirements may be well-suited 
for cooperative theorem proving. After that, we describe how requirement-based 
cooperation can be organized. 



3.1 Architecture and abstract process model 

The basic idea of requirement-based theorem proving is to establish coopera- 
tion between several different saturation-based theorem provers by exchanging 
requests and responses to requests. Requests describe certain needs of theorem 
provers, responses to requests contain information of receivers of requests that 
may be well-suited in order to fulfill some needs formulated in the requests. We 
consider two different types of requests. 

Firstly, if it is possible to divide a proof problem into various (sub-) problems, 
a prover can require that some of the sub-problems should be solved by other 
provers. Hence, requests are used for sub-problem division and transfer. Secondly, 
it is possible to demand information of other provers that may be helpful for 
solving the actual proof task. The most profitable information a prover can obtain 
from others is a set of facts. 

The architecture of our system can be described as follows: On each processor
in a network of cooperating computers a saturation-based theorem prover
conducts a search for a proof goal. We assume that all provers have correct
inference rules regarding a common logical consequence relation $\models$. All provers start
with a common original proof goal. Since it is possible, however, that provers
divide problems into sub-problems, it might be that in later steps of the proof
run different provers have different (sub-)goals. Each prover is assigned a unique
number. We either let only different incarnations of the same prover cooperate,
differing from each other only in the search-guiding heuristics they employ, and
hence have a network of homogeneous provers, or we employ different provers
(heterogeneous network). We assume that each prover is able to communicate
requests and responses directly to each other prover.

The working scheme of the provers is characterized by certain phases. While
the provers tackle their problem independently during working phases $P_w^i$,
cooperation takes place during cooperation phases $P_c^i$. Working phases and
cooperation phases alternate with each other. Thus, the sequence of phases is

$P_w^0, P_c^0, P_w^1, P_c^1, \ldots$

The activities during a cooperation phase can be divided into four activities 
of the following process model: 

1. Determination and transmission of requests to other provers. 

2. Transmission of responses to earlier requests of other provers. 

3. Receiving and processing foreign requests. 

4. Receiving and processing responses to own earlier requests. 

This process model does not allow for an immediate processing of incoming
requests, i.e. it is not possible to receive a request, to process it, and to transmit
a response in just one cooperation phase. Instead, the response must be
transmitted in a later cooperation phase. Thus, the cooperation scheme is somewhat
inflexible but minimizes the amount of communication.



3.2 Fact-represented requests and responses 

In order to make the process model more concrete we make our notions of request
and response precise. In the following, $F^A_{\mathcal{A}}$ and $F^P_{\mathcal{A}}$ denote the sets of active and
passive facts of a prover $\mathcal{A}$.

Definition 1 (request).

A request from a prover $\mathcal{A}$ to a prover $\mathcal{B}$ is a tuple $req = (id_{req}, \lambda_{req}, S_{req}, t_{req})$.
$id_{req} \in \mathbb{N}$ is the number of the request, $\lambda_{req}$ is a fact, $S_{req}$ is a predicate defined
on $(\lambda_{req}, F^A_{\mathcal{B}}, F^P_{\mathcal{B}})$, and $t_{req} \in \mathbb{N}$ a time index.

The component $id_{req}$ of a request should be, from the point of view of the
sender of the request, a unique number which is needed in order to identify
requests and responses (see below). The fact $\lambda_{req}$ represents the request, and the
predicate $S_{req}$ is a satisfiability condition of the request $req$: If the predicate
$S_{req}$ is true, the request is completely processed and can hence be answered by
the receiver. Now, we show for both kinds of requests, sub-problem transfer and
requests that ask for facts, how they can be represented by $\lambda_{req}$ and $S_{req}$.

Firstly, if we want to transfer a sub-goal $g$ by a request $req$, we set $\lambda_{req} = g$.
The satisfiability condition $S_{req}$ is defined by $S_{req}(\lambda_{req}, F^A_{\mathcal{B}}, F^P_{\mathcal{B}})$ iff $\lambda_{req}$ is
proved by the receiver $\mathcal{B}$ to be a logical consequence of its initial axiomatization.
Secondly, if we ask for certain facts via a request, $\lambda_{req}$ is a so-called schema
fact. This schema fact describes in a certain way what facts look like that may
be useful for the sender of a request. In our methods, schema facts $\lambda_{req}$ are
valid facts, i.e. logical consequences of $Ax$. The satisfiability condition $S_{req}$ holds
if the receiver $\mathcal{B}$ has at least one fact $\lambda \in F^A_{\mathcal{B}} \cup F^P_{\mathcal{B}}$ that corresponds to the
schema given through $\lambda_{req}$. This is tested via a correspondence predicate $C$, i.e.
in this case $S_{req}(\lambda_{req}, F^A_{\mathcal{B}}, F^P_{\mathcal{B}})$ iff $\exists \lambda \in F^A_{\mathcal{B}} \cup F^P_{\mathcal{B}} : C(\lambda_{req}, \lambda)$. Different methods
for requesting facts can be developed by using different methods for identifying
schema facts and constructing correspondence predicates (see section 5). Note
that this is the crucial point of our approach since the schema facts make the
wishes of certain provers concrete and the correspondence predicates determine
whether a prover is able to fulfill wishes and hence support another.

The time index $t_{req}$ is the maximal number of working phases which are
allowed to take place between the receipt of the request and the transmission of
its response. The idea behind the use of such a time index is that the receiver of
the request should not work mainly on the request but on its own proof attempt.
Requests of other provers should be tackled alongside the prover's own activities.
Since we do not want to put too much load on each prover through requests of
others, we restrict the processing of requests to a fixed duration. Note that we
also use the time index $t_{req}$ for defining the predicate $S_{req}$. We add the condition
“time limit $t_{req}$ is exceeded” as a conjunctive condition to $S_{req}$ when dealing with
requests for facts. Thus, we achieve that all facts are inserted into the response
set that fulfill the correspondence predicate $S_{req}$ after the expiration of the time
limit (see below).

Responses are represented by facts, too: 

Definition 2 (response). 

A response of a prover $\mathcal{A}$ to a prover $\mathcal{B}$ is a triple $rsp = (id_{rsp}, B_{rsp}, \Lambda_{rsp})$.
$id_{rsp} \in \mathbb{N}$ is the number of the response, $B_{rsp}$ is a Boolean value, and $\Lambda_{rsp}$ a set
of facts.

The component $id_{rsp}$ of a response equals the number of the respective
request that is being answered. The Boolean value $B_{rsp}$ indicates whether or not
the responder could process the request successfully (regarding the satisfiability
condition $S_{req}$) within the time limit given by $t_{req}$. The response set $\Lambda_{rsp}$ is a set
of facts which represents the answer to a request. If we respond to a request that
transferred a sub-problem, usually $\Lambda_{rsp} = \emptyset$. If a prover responds to a request
$(id_{req}, \lambda_{req}, S_{req}, t_{req})$ for facts, $S_{req}$ is based on the correspondence predicate
$C$, and $I$ is the set of axioms, $\Lambda_{rsp}$ contains $max_{rsp}$ facts $\lambda$ with $I \models \lambda$ and
$C(\lambda_{req}, \lambda)$.
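
To make Definitions 1 and 2 more tangible, the following Python sketch encodes
requests and responses as small records. It is only an illustration: facts are
represented as strings, the time-limit conjunct of $S_{req}$ is left out, and the
names `Request`, `Response`, `fact_request` and `answer` are assumptions that do not
stem from any of the systems discussed.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Set

Fact = str  # assumption: facts are represented as plain strings

@dataclass(frozen=True)
class Request:
    id_req: int                                                # number of the request
    fact: Fact                                                 # lambda_req
    satisfied: Callable[[Fact, Set[Fact], Set[Fact]], bool]    # S_req
    t_req: int                                                 # time index

@dataclass(frozen=True)
class Response:
    id_rsp: int                                # number of the answered request
    success: bool                              # B_rsp
    facts: FrozenSet[Fact] = frozenset()       # response set

def fact_request(id_req, schema, corresponds, t_req):
    """Build a request for facts whose S_req is based on a correspondence predicate."""
    def s_req(lam, active, passive):
        return any(corresponds(lam, f) for f in active | passive)
    return Request(id_req, schema, s_req, t_req)

def answer(req, corresponds, active, passive, max_rsp):
    """Answer a request for facts with at most max_rsp corresponding facts."""
    if not req.satisfied(req.fact, active, passive):
        return Response(req.id_req, False)
    matching = sorted(f for f in active | passive if corresponds(req.fact, f))
    return Response(req.id_req, True, frozenset(matching[:max_rsp]))
```

Here the correspondence predicate is passed in as an ordinary function, so that
different ways of requesting facts only differ in the schema fact and in the
predicate supplied.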

By employing these definitions we are able to outline the activities of the 
process model in more detail: 

The determination and transmission of requests is performed as follows: At 
first it is necessary to identify on the one hand sub-problems that should be 
tackled by other provers, on the other hand schemata of such facts that appear 
to be useful. After that, a unique idreq as well as a suitable time index treq 
is assigned to each sub-problem or schema fact that should be sent to another
prover via a request req. The next step is to insert each request into a queue Req 
of open requests of the sender and to transmit the request to other provers which 
are part of the network. More exactly, we transmit all requests that ask for facts 
to every prover which is part of the network, but transmit each sub-problem 
only to one other prover. 

In order to receive and respond to requests of other provers, it is necessary to
also have queues $Req^i$ of open requests of other provers $i$. In order to respond
to such requests, the requests $req \in Req^i$ must be checked. If $req$ is fulfilled, a
suitable response can be transmitted and $req$ can be deleted from the queue
$Req^i$. Otherwise, it is necessary to check whether the time limit is exceeded. If
this is true, the response $(id_{req}, false, \emptyset)$ must be communicated to the sender
of the request and $req$ must be deleted from the queue.
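
The treatment of a queue of open foreign requests during one cooperation phase
might look as follows. The sketch rests on assumptions that the text does not fix:
the time limit is counted in phases since receipt and compared with `>=`, and the
names `OpenRequest` and `cooperation_step` are hypothetical.

```python
from typing import NamedTuple

class OpenRequest(NamedTuple):
    id_req: int
    received_phase: int   # phase index at which the request arrived
    t_req: int            # maximal number of working phases before answering

def cooperation_step(queue, phase, fulfilled, answer):
    """Check all open requests; return (responses to send, requests still open)."""
    responses, remaining = [], []
    for req in queue:
        if fulfilled(req):
            responses.append(answer(req))                  # positive response
        elif phase - req.received_phase >= req.t_req:
            # time limit exceeded: report failure with an empty response set
            responses.append((req.id_req, False, frozenset()))
        else:
            remaining.append(req)                          # keep the request open
    return responses, remaining
```

Requests that are answered, positively or negatively, are dropped from the queue;
all others are re-examined in the next cooperation phase.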

If a prover receives a request $req$ from another prover $i$, firstly $req$ is inserted
into $Req^i$. Secondly, the processing of the request is initiated (see sections 4 and
5 for details).

When receiving a response $rsp$ to a request, it is at first necessary to
determine the original request $req \in Req$ with the help of the $id_{rsp}$ component of
the response. If the request has been processed successfully, one can consider the
respective sub-problem to be solved (if a sub-problem was transmitted by the
request) or integrate the response set $\Lambda_{rsp}$ into the search state (if facts have
been asked for). If the request has not been processed successfully, the sender
can use this information in future (see [Fuc98b]). Finally, $req$ has to be deleted
from $Req$.



4 Sub-problem Transfer by Requirements 

In this section we present a method for transferring sub-problems via requests.
We restrict ourselves to the area of first-order theorem proving and henceforth
facts are first-order clauses. For a literal $l$ we define $\sim l$ by $\sim l = l'$ if $l \equiv \neg l'$,
and $\sim l = \neg l$ otherwise. For a clause $C = \{l_1, \ldots, l_n\}$, $\sim C$ is the set of clauses
$\sim C = \{\{\sim l_1\}, \ldots, \{\sim l_n\}\}$. If $C$ is a clause, $V(C)$ denotes the set of different
variables in $C$.

In order to realize requirement-based cooperation by transferring sub-prob- 
lems we employ our abstract model. It is only needed to make three aspects more 
precise. Firstly, we have to answer the question how we can identify certain sub- 
problems. Secondly, we must introduce techniques for managing our different 
sub-problems. Finally, we have to develop a method well-suited for processing 
sub-problems of other provers. 



4.1 Identifying sub-problems 

We start with the identification of sub-problems. In the following, we assume
that our proof problem is given as a set $\mathcal{M}$ of clauses whose inconsistency is
to be proved by deriving the empty clause $\Box$ from $\mathcal{M}$. This is not a restriction
in comparison with our original notion of a proof problem because $Ax \models C$
iff $Ax \cup \sim C$ is inconsistent. Henceforth, we will call the clauses from $\sim C$ goal
clauses. Now, consider this situation:

Definition 3 (i-AND-partition). 

Let $\mathcal{M} = \mathcal{M}' \cup \{C\}$, $\mathcal{M}'$ be a set of clauses, $C$ be a clause. We call $(P_i)_{1 \le i \le n}$
an i-AND-partition of $C$ regarding $\mathcal{M}$ iff

- $(P_i)_{1 \le i \le n}$ is a partition of $C$, i.e. $\bigcup_{1 \le i \le n} P_i = C$ and $P_i \cap P_j = \emptyset$ if $i \ne j$

- $\mathcal{M}$ is inconsistent iff $\forall i, 1 \le i \le n$: $\mathcal{M}' \cup \{P_i\}$ is inconsistent.

If the inconsistency of a set $\mathcal{M}$ should be proved and we have identified an
i-AND-partition of a clause $C \in \mathcal{M}$, we have also divided our original proof
problem into $n$ sub-problems $p_i$ = “$\mathcal{M}' \cup \{P_i\}$ is inconsistent”.

Such an approach for dividing a problem into sub-problems is viable because 
there is an easy method for identifying i-AND partitions of a clause: 

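Theorem 4. ([Fuc98b]) Let $\mathcal{M}$ be a set of clauses, $\mathcal{M} = \mathcal{M}' \cup \{C\}$ for a set
of clauses $\mathcal{M}'$ and a clause $C$. Let $(P_i)_{1 \le i \le n}$ be a partition of $C$ and
$V(P_i) \cap V(P_j) = \emptyset$ for $i \ne j$. Furthermore, let the sets of clauses $\mathcal{N}_i$ ($1 \le i \le n$) be
defined by $\mathcal{N}_i = \mathcal{M}' \cup \{\{\sim l\} : l \in P_j, j < i, V(P_j) = \emptyset\}$. Then it holds:

1. $(P_i)_{1 \le i \le n}$ is an i-AND-partition of $C$ regarding $\mathcal{M}$.

2. $\mathcal{M}$ is inconsistent iff $\forall i, 1 \le i \le n$: $\mathcal{N}_i \cup \{P_i\}$ is inconsistent.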

The theorem points out a method for creating new sub-problems: On each
processing node (if we have not already divided the problem into sub-problems)
we check for each activated clause $C$ which is a descendant of a goal clause
whether it can be partitioned into $(P_i)_{1 \le i \le n}$, $n \ge 2$, with $V(P_i) \cap V(P_j) = \emptyset$ for $i \ne j$.
If $m$ is the number of cooperating provers, we limit the value $n$ by $1 < n \le m$.
Then, each prover that is able to find such a partition of a clause $C$ distributes
$n - 1$ tasks $p_2, \ldots, p_n$ to other provers via requests (a prover obtains exactly one
of these sub-problems) and tackles the remaining task $p_1$. That is, it replaces
the clause $C$ with its sub-clause $P_1$. The tasks which are sent to other provers
are stored in a list $R = (p_2, \ldots, p_n)$. Note that the prover may have to tackle
these sub-problems as well, because there is no guarantee that the other
provers can give positive answers to the requests within their time limits.

A theorem prover that has divided the problem can work more efficiently. 
This is because it works with smaller clauses. Further, it is possible to obtain 
interesting lemmas from other provers if they give positive responses to requests. 
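
A partition with pairwise variable-disjoint parts, as required by Theorem 4, can be
computed by grouping literals that share variables. The Python sketch below is one
possible way to do this and is not taken from [Fuc98b]; literals are given as pairs
of a textual form and a set of variable names, and the function name and the policy
of merging surplus parts to respect the bound on $n$ are assumptions.

```python
def variable_disjoint_partition(clause, max_parts):
    """Split a clause into parts that share no variables.

    clause: list of (literal_text, set_of_variable_names)
    max_parts: upper bound on the number of parts (number of provers)
    """
    parts = []  # list of (literals, variables)
    for lit, lit_vars in clause:
        lits, vs = [lit], set(lit_vars)
        rest = []
        for p_lits, p_vars in parts:
            if p_vars & vs:
                # the literal shares a variable with this part: merge them
                lits = p_lits + lits
                vs |= p_vars
            else:
                rest.append((p_lits, p_vars))
        parts = rest + [(lits, vs)]
    # merging parts keeps them variable-disjoint, so surplus parts can be fused
    while len(parts) > max_parts:
        (l1, v1), (l2, v2) = parts[-2], parts[-1]
        parts[-2:] = [(l1 + l2, v1 | v2)]
    return [lits for lits, _ in parts]
```

If the result has at least two parts, one part can be kept by the prover itself and
the remaining parts can be sent to other provers as sub-problems.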



4.2 Managing and processing sub-problems 

The main problem caused by this kind of problem division and transfer is that
sender and receiver of a request have to work with clauses that are in general
not logical consequences of the initial set of clauses. This is because a sub-clause
of a clause need not logically follow from the clause. Thus, we must develop
mechanisms so as to work with such clauses.

We start with the sender of a request and assume that it tackles the
sub-problem $p_1$ = “$\mathcal{M}' \cup \{P_1\}$ is inconsistent” by adding $P_1$ to its search state.

In order to work with a semantically invalid clause $P_1$ during the inference
process the sender introduces a tag for $P_1$ and all descendants of $P_1$. It
indicates that these clauses are not logical consequences of the initial clause set but
descendants of semantically invalid clauses. Hence, we no longer work
with clauses $C$ but with tagged clauses $(C, \tau)$. Either $\tau = \varepsilon$ or $\tau = n \in \mathbb{N}$.
$\tau = \varepsilon$ denotes that the clause $C$ is untagged, i.e. it is a logical consequence of the
initial clauses. If $\tau = n$, $n$ being the number of the sender, then the clause $C$ is
a descendant of $P_1$. In order to perform inferences we must replace the inference
system $\mathcal{I}$ of each prover by an inference system $\mathcal{I}^{\tau}$.



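Definition 5 (Inference system $\mathcal{I}^{\tau}$).

Let $\mathcal{I}$ be an inference system which works on sets of clauses. Then we construct
the inference system $\mathcal{I}^{\tau}$ working on sets of tagged clauses as follows:

1. For each expanding inference rule

$\{C_1, \ldots, C_n\} \vdash \{C_1, \ldots, C_n, C\};\ P(C_1, \ldots, C_n)$

in $\mathcal{I}$, $\mathcal{I}^{\tau}$ contains the rules

$\{(C_1, \varepsilon), \ldots, (C_n, \varepsilon)\} \vdash \{(C_1, \varepsilon), \ldots, (C_n, \varepsilon), (C, \varepsilon)\};\ P(C_1, \ldots, C_n)$

and

$\{(C_1, \tau_1), \ldots, (C_n, \tau_n)\} \vdash \{(C_1, \tau_1), \ldots, (C_n, \tau_n), (C, \tau)\};\ P(C_1, \ldots, C_n) \land$
$\exists k \in \mathbb{N} : (\exists i : \tau_i = k \land \forall i : \tau_i \in \{k, \varepsilon\} \land \tau = k)$

2. For each contracting inference rule

$\{C_1, \ldots, C_n, C\} \vdash \{C_1, \ldots, C_n, C'\};\ P(C_1, \ldots, C_n, C)$

in $\mathcal{I}$ (rules that delete clauses are transformed analogously), $\mathcal{I}^{\tau}$ contains

$\{(C_1, \varepsilon), \ldots, (C_n, \varepsilon), (C, \tau)\} \vdash \{(C_1, \varepsilon), \ldots, (C_n, \varepsilon), (C', \tau)\};\ P(C_1, \ldots, C_n, C)$

and

$\{(C_1, \tau_1), \ldots, (C_n, \tau_n), (C, \tau)\} \vdash \{(C_1, \tau_1), \ldots, (C_n, \tau_n), (C', \tau')\} \cup \mathcal{D};$
$P(C_1, \ldots, C_n, C) \land \exists k \in \mathbb{N} : (\exists i : \tau_i = k \land \forall i : \tau_i \in \{k, \varepsilon\} \land \tau \in \{k, \varepsilon\} \land \tau' = k)$,
$\mathcal{D} = \emptyset$ if $\tau \ne \varepsilon$, $\mathcal{D} = (C, \varepsilon)$ if $\tau = \varepsilon$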



Hence, expansion inferences are performed in such a way that a clause $C$
which is a result of an expanding inference with premises $C_1, \ldots, C_n$ is tagged if
some clauses $C_i$ are tagged. Untagged clauses can contract every other clause and
in that case the tag remains unchanged. If tagged clauses are able to contract an
untagged clause, it is necessary to store a copy of the (un-contracted) untagged
clause. Otherwise, completeness may be lost if the processing of the request is
finished and its offspring has been eliminated. Such copies are stored in a list
$\mathcal{D}_j$, $j$ being the number of the prover.
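
The tag bookkeeping of Definition 5 can be summarized by two small functions. The
following Python sketch is an illustration only: the untagged case $\varepsilon$ is
encoded as `None`, and the function names are hypothetical. The first function
computes the tag of the conclusion of an expanding inference, the second the tag of
a contracted clause together with a flag telling whether an untagged copy has to be
stored.

```python
EPS = None  # encoding of the empty tag (untagged clause)

def expansion_tag(premise_tags):
    """Tag of the conclusion; inferences mixing different tags are forbidden."""
    tagged = {t for t in premise_tags if t is not EPS}
    if not tagged:
        return EPS                      # all premises untagged
    if len(tagged) == 1:
        return next(iter(tagged))       # the common tag k is inherited
    raise ValueError("premises carry different tags")

def contraction_result(premise_tags, target_tag):
    """Return (tag of the contracted clause, whether an untagged copy is stored)."""
    k = expansion_tag(premise_tags)
    if k is EPS:
        return target_tag, False        # untagged premises: tag stays unchanged
    if target_tag not in (EPS, k):
        raise ValueError("contracting and contracted clause carry different tags")
    # a tagged clause contracts the target: result gets tag k, and an
    # untagged target is copied so that it can be restored later
    return k, target_tag is EPS
```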

Now, if a prover is able to derive the empty clause $\Box$, the tag of the clause
is checked. If the clause is untagged, a proof of the original goal has been found.
If it is tagged with the number of the prover, the current sub-problem has been
solved. In the latter case the following activities take place. All clauses which
are tagged with the prover’s number $j$ are deleted and the clauses $D \in \mathcal{D}_j$ are
integrated untagged into the search state. If the list of open sub-problems is empty, $R = ()$,
all sub-problems have been solved, i.e. also the original problem. Otherwise, a
new sub-problem is chosen from $R$ and processed as described.

This modified inference scheme also allows us to process incoming responses
easily. If a sub-problem $p_j$ has been solved by another prover, it must only be
eliminated from $R$. Moreover, if the request clause $P_j$ was ground, the clauses
being elements of $\{\{\sim l\} : l \in P_j\}$ can be utilized as new lemmas in future.

When working as a receiver of open sub-problems we proceed in a similar way.
Each sub-goal received from another prover is tagged with the number of this
prover and is added to the search state. Inferences between tagged and untagged
clauses are performed with inference system $\mathcal{I}^{\tau}$. Hence, we forbid inferences
between clauses having different tags in order to avoid inconsistency (see also
[Fuc98b]).

If an empty clause is derived by a prover and it is untagged or tagged with 
the prover’s number the activities are as described above. If it is tagged with 
the number of another prover $i$, all clauses with this tag are deleted and the
clauses from $\mathcal{D}_i$ are added to the search state. Furthermore, the sub-problem is
considered to be solved, i.e. a positive response can be sent to the request sender 
in the next cooperation phase. 

5 Requirement-Based Exchange of Facts 

In this section, we present two different methods for exchanging facts via requests 
and responses. Note that we restrict ourselves again to the area of first-order 
theorem proving, i.e. facts correspond in the following to first-order clauses. We 
assume that all provers employ the superposition calculus and additional contrac- 
tion rules like subsumption and rewriting. Note that the introduced techniques 
can easily be transferred to other similar saturation-based calculi. 

The principal scheme of a requirement-based exchange of clauses is already
known through our abstract model. However, it is necessary to describe two 
remaining aspects. Firstly, we must introduce methods for detecting request 
clauses (schema clauses) in each cooperation phase P^. Secondly, we have to 
deal with the issue of how such requests can be processed by the receivers, i.e. 
we have to make precise how to compute a response set Crsp of a response rsp 
to a request req. 



5.1 Expansion-based requests 

The basic idea of requests for clauses is that a theorem prover tries to get those 
clauses from other provers that appear to be part of a proof, but are not already 
derived and seem to be difficult to derive. The main problem in this context is 
that — because of the general undecidability of first-order theorem proving — it is 
impossible to predict whether or not a clause is part of a proof. However, often
a prover is able to estimate whether some of its own already activated clauses
possibly contribute to a proof (see below). Then, if we assume that a prover has
identified a set $\mathcal{M}$ of “interesting” activated clauses, it can ask other provers
for clauses that allow for producing descendants with clauses
from $\mathcal{M}$. Perhaps some of this offspring can contribute to a proof. We call such
requests expansion-based requests.

In detail, in each cooperation phase a prover determines a set of request
clauses $P_i = \{C_1, \ldots, C_{max_{req}}\}$. Each request clause should be untagged,
i.e. a valid clause. As already mentioned, the clauses $C_j$ ($1 \le j \le max_{req}$)
should be the clauses of the prover that appear to be most likely to contribute
to a proof, i.e. they should be optimal regarding a judgment function $\varphi$ which
rates the probability that a clause is part of a proof.

We want to deal with the realization of $\varphi$ in some more detail. Since it is
the aim of a prover to derive the empty clause, a clause is considered to be the
better the fewer literals it has. Thus, we could use a formula for $\varphi(C)$ that is
based only on the number of literals of $C$. We
adopted and refined this method as follows. In addition to the length of a clause
we take into account that clauses having literals with a rather “flat” syntactic
structure can often be used for expansion inference steps like resolution. Thus,
considering the number of literals and the syntactic structure of the literals we
obtain the following weighting function.

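Definition 6 (weighting function $\varphi$ for expansion-based requests).

The weighting function $\varphi$ for expansion-based requests is defined on clauses by

$$\varphi(C) = \sum_{i=1}^{n} \varphi_{Lit}(l_i); \quad C = \{l_1, \ldots, l_n\}$$

The function $\varphi_{Lit}$ is defined as follows: $\varphi_{Lit}(l) = \varphi_{Term}(l, 0)$ if $l$ is a positive
literal, and $\varphi_{Lit}(l) = \varphi_{Term}(l', 0)$ if $l = \neg l'$. The function $\varphi_{Term}$ can be computed
by

$$\varphi_{Term}(t, d) = \begin{cases} 1 + d & ;\ t \text{ is a variable} \\ 2 + d + \sum_{i=1}^{n} \varphi_{Term}(t_i, d + 1) & ;\ t = id(t_1, \ldots, t_n),\ id \text{ is a function or predicate symbol} \end{cases}$$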

$\varphi$ judges a clause the better the fewer literals it has, and the fewer symbols and
deep sub-terms each literal has. Thus, the function complies with our demands
formulated above.
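To make the weighting concrete, the following is a minimal sketch of Definition 6. It reads the term-level function as assigning 1 + d to a variable at depth d, and 2 + d plus the weights of the arguments at depth d + 1 to a compound term; this reading, the term encoding and the names `phi_term`, `phi_lit` and `phi` are assumptions made for illustration. Variables are encoded as strings, compound terms and atoms as tuples `(symbol, arg1, ..., argn)`, and a literal as a pair `(is_positive, atom)`.

```python
def phi_term(t, d=0):
    """Weight of a term or atom t occurring at nesting depth d."""
    if isinstance(t, str):                 # a variable
        return 1 + d
    _symbol, *args = t                     # function or predicate symbol
    return 2 + d + sum(phi_term(a, d + 1) for a in args)


def phi_lit(literal):
    """Weight of a literal; the sign of the literal does not matter."""
    _is_positive, atom = literal
    return phi_term(atom, 0)


def phi(clause):
    """Weight of a clause given as an iterable of literals (smaller is better)."""
    return sum(phi_lit(l) for l in clause)


flat = [(True, ("P", "x"))]                      # P(x)
deep = [(True, ("P", ("f", ("a",))))]            # P(f(a))
mixed = [(True, ("P", "x")), (False, ("Q", "x", "y"))]
```

Under this reading the flat unit clause `flat` is rated better (smaller weight) than `deep`, and adding a second literal as in `mixed` increases the weight further.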

The second main aspect, besides the determination of request clauses, is
the processing of requests by their receivers. Essentially, we have to deal with
the problem of computing a response set $C_{rsp}$ regarding a request $req$. The
easiest method for determining a response set $C_{rsp}$ is to insert such valid clauses
into $C_{rsp}$ that allow for expanding inferences with $C_{req}$. A disadvantage of this
approach is that certain inferences must be performed twice. On the one hand,
it is necessary to perform expanding inferences with $C_{req}$ at the receiver site
in order to determine the clauses $C$ that are elements of $C_{rsp}$. On the other hand,
the receiver of the response set $C_{rsp}$ must perform exactly the same inferences
with $C_{req}$ when it integrates the clauses of $C_{rsp}$ into its search state. Thus,
our refinement of this simple method is as follows. The main idea is to already
perform inferences with $C_{req}$ at the responder site and to transmit only such
valid clauses to the sender of the request that are already descendants of $C_{req}$
and some of the clauses of the responder.

These descendants can be computed either independently of the “normal” 
inferences after the receipt of a request, or simultaneously to the inferences nec- 
essary to tackle the proof problem. We chose the latter approach by integrating 
$C_{req}$ into the search state of the receiver and tagging it with both the number of
the sender of the request and the id of the request. Then, by extending the tagging
mechanism of the preceding section, descendants of $C_{req}$ can be computed
(see [Fuc98b]). 

5.2 Contraction-based requests 

There is also another concept for a requirement-based exchange of clauses. In- 
deed it is difficult to predict whether or not a clause contributes to a proof but 
nevertheless it is possible to recognize whether a clause is useful for the search 
for the proof. If a clause is able to contract many other clauses it is definitely 
useful for the search process because it helps to save both memory and com- 
putation effort. Thus, it is also interesting to require that other provers should 
send clauses that allow for a lot of contracting inferences. We call these requests 
contraction-based requests. 

Especially well-suited for reducing the amount of data and computation are
clauses that subsume or rewrite clauses that tend to produce much offspring.
Thus, the set of clauses $\mathcal{M} = \{C : C$ is a valid, active, and positive unit, $C$
is among the $max_{req}$ largest generators of clauses$\}$ is determined as a set of
request clauses in each cooperation phase. This set offers each receiver of the
request clauses the possibility of determining clauses which are able to subsume
or rewrite them. These clauses are then especially useful for the search for the
proof because they can contract clauses from $\mathcal{M}$ which cause much overhead.
Note that we restrict ourselves to positive units mainly for efficiency reasons.
In order to determine a response set $C_{rsp}$ regarding a request with request clause
$C_{req}$, we insert into $C_{rsp}$ on the one hand clauses which are able to subsume
$C_{req}$, and on the other hand clauses which are able to rewrite $C_{req}$. Hence, we have

$$C_{rsp} = C_{rsp,sub} \cup C_{rsp,rew}.$$

If $\mathcal{F}$ contains all active clauses of the responder that are logical consequences
of the initial set of clauses, the set $C_{rsp,sub}$ regarding a request clause $C_{req}$ is
simply given by

$$C_{rsp,sub} = \{C : (C \in \mathcal{F}, \exists\sigma : \sigma(C) \equiv C_{req})\}$$

Determining a set $C_{rsp,rew}$ of clauses which are able to rewrite $C_{req}$ is more
complicated than before because we must consider the ordering $\succ$ each prover
uses for performing inferences. We restrict ourselves in the following again to
response sets containing only positive equations. Since we cannot rewrite with
the minimal side of an equation (regarding $\succ$), we must at first identify the sides
relevant for rewriting and transform clauses $C \in \mathcal{F}$ with the following function
$\Theta$ to sets $\Theta(C)$. If the sender of the request and the responder have an identical
ordering we utilize

$$\Theta(C) = \begin{cases} \{s\} & ;\ C \equiv s = t,\ s \succ t \\ \{s, t\} & ;\ C \equiv s = t,\ s \not\succ t,\ t \not\succ s \end{cases}$$

Hence, we consider only the left hand side of a rewrite rule but both sides
of an equation. Otherwise, if sender and receiver employ different orderings, we
employ

$$\Theta(C) = \{s, t\}; \quad C \equiv s = t$$

Then, it is necessary to check whether terms from $\Theta(C)$ match a sub-term
of $C_{req}$. Such clauses can be inserted into $C_{rsp,rew}$ and sent via response messages.

In the following, $O(C_{req})$ denotes the set of positions in the request clause
$C_{req}$ and $C_{req}|_p$ the sub-term of $C_{req}$ at position $p$. Then, we obtain:

$$C_{rsp,rew} = \{C : (C \in \mathcal{F}, \exists(\sigma, C' \in \Theta(C), p \in O(C_{req})) : \sigma(C') \equiv C_{req}|_p)\}$$

Note that these response sets can efficiently be computed since the sets of 
active clauses are usually rather small. 
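The two response sets can be illustrated by a small sketch based on one-way matching. It is not the implementation used in the coupled provers: the term encoding (variables as strings, compound terms as tuples `(symbol, arg1, ..., argn)`, unit equations as `("=", s, t)`), the helper names, and the choice to consider only proper sub-terms of the request clause are assumptions. The default `both_sides` corresponds to the case of different orderings, where both sides of an equation are relevant.

```python
def match(pattern, target, subst=None):
    """One-way matching: return a substitution s with s(pattern) == target, or None."""
    subst = dict(subst or {})
    if isinstance(pattern, str):                       # pattern is a variable
        if pattern in subst:
            return subst if subst[pattern] == target else None
        subst[pattern] = target
        return subst
    if (isinstance(target, str) or pattern[0] != target[0]
            or len(pattern) != len(target)):
        return None
    for p, t in zip(pattern[1:], target[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def subterms(t):
    """Yield t and all of its sub-terms."""
    yield t
    if not isinstance(t, str):
        for arg in t[1:]:
            yield from subterms(arg)


def both_sides(equation):
    """Sides relevant for rewriting when the orderings of the provers differ."""
    _eq, s, t = equation
    return [s, t]


def response_sub(active, c_req):
    """Active unit clauses that subsume the request clause."""
    return [c for c in active if match(c, c_req) is not None]


def response_rew(active, c_req, theta=both_sides):
    """Active equations with a relevant side matching a proper sub-term of c_req."""
    targets = [s for arg in c_req[1:] for s in subterms(arg)]
    return [c for c in active if c[0] == "="
            and any(match(side, s) is not None for side in theta(c) for s in targets)]


eq1 = ("=", ("f", "x", ("e",)), "x")          # f(x, e) = x
eq2 = ("=", ("g", ("a",)), ("b",))            # g(a) = b
eq3 = ("=", ("f", "y", "z"), ("c",))          # f(y, z) = c
request = ("=", ("f", ("g", ("c",)), ("e",)), ("c",))   # f(g(c), e) = c
```

Here `eq3` subsumes `request`, while `eq1` and `eq3` both have a side that matches the sub-term f(g(c), e); `eq2` contributes to neither set.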

6 Experimental Results 

In order to examine the potential of our cooperation concepts we conducted our 
experimental studies in the light of different domains (ROB, HEN, LCL, LDA) 
of the problem library TPTP (see [SSY94]). We restricted ourselves to the area 
of superposition-based theorem proving and coupled the provers SPASS and 
Discount. Our test set consisted of pure unit equality problems and of problems 
specified in full first-order logic. Thus, we can reveal that our cooperation concept 
achieves cooperation in an area where both provers are complete as well as in 
an area where one prover is only able to support the other but not to solve 
the original problem. Hence, we show that our concept is well-suited for provers 
having equal rights as well as for provers being in a master-slave relation. 

Since both calculi — superposition and unfailing completion — are complete 
for pure equational logic (EQ), SPASS and Discount can work as partners 
having equal rights for problems of EQ. Thus, we let each prover send requests 
and responses to requests to its counterpart. Because $|C| = 1$ for all
clauses $C$, it is not possible to divide a problem into sub-problems.
Thus, we must omit requests dealing with sub-problem transfer. Nevertheless, 
expansion- and contraction-based requests for exchanging clauses can be utilized.
We exchanged expansion-based requests and responses in the following manner.
In each cooperation phase each prover determines $max_{req} = 10$ request clauses
to be distributed to the other prover. In order to respond to an expansion-based
request we inserted $max_{rsp} = 3$ clauses into the respective response set $C_{rsp}$. It is
sensible to give the responder enough time for processing the request. Therefore,
we set the time limit $t_{req} = 3$. In order to exchange contraction-based requests we
restricted the size of the set $\mathcal{M}$ of largest generators of clauses to $max_{req} = 10$.
The time limit $t_{req}$ for contraction-based requests was given by $t_{req} = 1$.

In the area of full first-order logic with equality (PL1EQ) Discount is not
able to prove every valid goal because it can only deal with equations. Nevertheless,
SPASS and Discount can work in some kind of master-slave relation. Because
Discount cannot prove every valid goal, we decided to let
only SPASS send expansion-based requests for clauses. We extended Discount
so as to allow it to perform superposition with its equations and clauses received
from SPASS. Contraction-based requests were exchanged by both provers. We
have chosen the same parameter setting as in the area of unit equality. Because
in first-order logic with equality a clause can have a length greater
than 1, we can transfer sub-problems from SPASS to Discount.

In order to allow for a better comparison of our different concepts for sending 
requests for clauses, we either exchanged only expansion-based requests and 
responses to the requests or contraction-based requests and responses. Requests 
that transferred sub-problems were — considering the above restrictions — always 
exchanged. 

In all of our test domains we only considered problems that none of the
provers could solve within 10 seconds (medium and hard problems). Moreover,
we restricted ourselves to problems with enough unit equations such that the
completion of Discount did not stop. For all examined problems we could
observe that one of the two variants of our cooperative system was either better
than each of the coupled provers (the runtime was less, or the cooperative
provers could solve a problem none of the coupled provers could solve when
working alone) or we achieved the same result, that is, neither the cooperative
system nor one of the coupled provers could cope with the problem. For
illustration purposes we present a small representative excerpt of these experiments
in Table 1 (enriched with some problems taken from other domains). In Table
1, the entry “-” denotes that the problem could not be solved within 1000 seconds
(all runtimes were measured on SPARCstations 20/712). Column 4 shows
whether the problem is specified in pure equational logic (EQ) or in first-order
logic with equality (PL1EQ). Column 5 displays the run time when employing
requests for sub-problem transfer and expansion-based requests for clauses, column
6 the respective time when exchanging requests for sub-problem transfer
and contraction-based requests for clauses. The last column 7 presents which
prover could solve the problem in the cooperating runs.

For all problems we can find at least one cooperation method that allows
for a gain of efficiency. Furthermore, sometimes it is even possible to solve problems
through cooperation that are out of reach for both of the coupled provers.
If we compare the results achieved by expansion-based requests with those of
contraction-based requests we can see that contraction-based requests are mostly
the better alternative. For a deeper analysis we refer to [Fuc98b].

problem    SPASS   Discount  EQ/PL1EQ  expans.  contr.  proved by
BOO007-4   403.4   -         EQ        330.7    144.4   Discount
GRP177-2   -       -         EQ        -        123.8   Discount
GRP179-1   -       -         EQ        447.0    63.5    Discount
LCL163-1   10.0    12.0      EQ        8.2      6.2     Discount
ROB005-1   -       109.6     EQ        36.6     60.6    SPASS
ROB008-1   -       98.8      EQ        13.5     85.9    SPASS
ROB022-1   15.1    -         EQ        2.3      3.9     SPASS
ROB023-1   204.6   -         EQ        47.3     44.6    SPASS
CIV001-1   24.9    -         PL1EQ     13.0     25.3    SPASS
LDA011-2   35.1    -         PL1EQ     40.2     30.4    SPASS
ROB011-1   105.3   -         PL1EQ     110.7    54.9    SPASS
ROB016-1   9.8     -         PL1EQ     4.3      5.8     SPASS
HEN009-5   309.9   -         PL1EQ     370.8    233.9   SPASS
HEN010-5   68.7    -         PL1EQ     62.9     70.3    SPASS
HEN011-5   41.2    -         PL1EQ     29.3     20.1    SPASS
LCL143-1   16.1    -         PL1EQ     12.4     11.3    SPASS

Table 1. Coupling SPASS and Discount by exchanging requests and responses



7 Discussion and Future Work 

We have presented the approach of requirement-based cooperative theorem proving.
This approach realizes a kind of demand-driven cooperation of saturation-based
provers. Thus, it is possible to orient the cooperation scheme towards the
concrete needs of theorem provers. We described
an abstract framework for requirements and, in particular, two aspects of
requirement-based cooperation: on the one hand requirement-based exchange
of facts, on the other hand sub-problem division and transfer via requests.

Related approaches for an exchange of information between theorem provers 
are mainly success-driven (e.g., [Sut92], [Den95], [BH95], [Fuc98a]). In contrast 
to our approach, in these methods information is sent to other provers without 
considering specific needs of the receivers. 

In future, it would be interesting to integrate also analytic provers, e.g. 
tableau-style provers, into our cooperative system. Since these provers are based 
on a division of the original problem into sub-problems especially sub-problem 
transfer via requests might be promising. Then, analytic provers can be used for 
identifying and transferring sub-problems, saturation-based provers for solving 
or simplifying them. Thus, requirement-based theorem proving offers the possi- 
bility to integrate both top-down and bottom-up theorem proving approaches. 
Abstract. The inference rule ℧-resolution for regular multiple-valued
logics is developed. One advantage of ℧-resolution is that linear, regular
proofs are possible. That is, unlike existing deduction techniques,
℧-resolution admits input deductions (for Horn sets) while maintaining
regular signs. More importantly, ℧-resolution proofs are at least as short
as proofs for definite clauses generated by the standard inference techniques
(annotated resolution and reduction), and pruning of the search
space occurs automatically.



1 Introduction 

Signed logics [18,10] provide a general¹ framework for reasoning about multiple-valued
logics (MVL’s). They evolved from a variety of work on non-standard
computational logics, including [2,3,5,6,8,15,14,16,20,22]. The key is the attachment
of signs (subsets of the set of truth values) to formulas in the MVL. This
approach is appealing because it facilitates the utilization of classical techniques 
for the analysis of non-standard logics, which reflects the essentially classical 
nature of human reasoning. That is, regardless of the domain of truth values 
associated with a logic, at the meta-level, humans interpret statements about 
the logic to be either true or false. 

This paper focuses on the class of regular signed logics. Regular signed logics 
are of interest in the knowledge representation and logic programming commu- 
nities because they correspond to the class of paraconsistent logics known as 
annotated logics, introduced by Subrahmanian [21], Blair and Subrahmanian 
[2], and Kifer et al. [12,13,23]. In [18], regular signed logics were also shown to 

* This research was supported in part by the National Science Foundation under grants 
CCR-9731893, CCR-9404338 and CCR-9504349. 

¹ Hähnle, R. and Escalada-Imaz, G. [11] have an excellent survey encompassing deductive
techniques for a wide class of MVL’s, including (properly) signed logics.



J. Dix, L. Farinas del Cerro, and U. Furbach (Eds.): JELIA’98, LNAI 1489, pp. 154-168, 1998. 
capture fuzzy logics, but in this paper, regular signed logics will refer to annotated
logics. In particular, the focus is on the definite Horn subset of annotated
logic, widely applied within logic programming. The inference rule ℧-resolution
is developed for regular signed logics, and its relative advantages with respect to
the standard inference rules for annotated logic programs (ALP’s), annotated
resolution and reduction, are described. These include the fact that linear, regular
proofs are possible; furthermore, ℧-resolution proofs are at least as short as
annotated resolution proofs, and pruning of the search space occurs automatically.

The next section is a summary of the basic ideas of signed formulas; greater 
detail can be found in [20] and in [18]. 



2 Signed Logics 

We assume a language $\Lambda$ consisting of (finite) logical formulas built in the usual
way from a set $\mathcal{A}$ of atoms (predicates and terms at the first order level), a set
of connectives, and a set of logical constants. For the sake of completeness, we
define a formula in $\Lambda$ as follows: Atoms are formulas; if $\Theta$ is an $n$-ary connective
and if $\mathcal{F}_1, \mathcal{F}_2, \ldots, \mathcal{F}_n$ are formulas, then so is $\Theta(\mathcal{F}_1, \mathcal{F}_2, \ldots, \mathcal{F}_n)$.

Associated with $\Lambda$ is a set $\Delta$ of truth values, and an interpretation for $\Lambda$
is a function from $\mathcal{A}$ to $\Delta$; i.e., an assignment of truth values to every atom
in $\Lambda$. A connective $\Theta$ of arity $n$ denotes a function $\Theta : \Delta^n \to \Delta$. Interpretations
are extended in the usual way to mappings from the set of formulas to $\Delta$.
Alternatively, a formula $\mathcal{F}$ of $\Lambda$ can be regarded as denoting a mapping from
interpretations to $\Delta$.

A sign is a subset of $\Delta$, and a signed formula is an expression of the form
$S : \mathcal{F}$, where $S$ is a sign and $\mathcal{F}$ is a formula in $\Lambda$. If $\mathcal{F}$ is an atom in $\Lambda$, we call
$S : \mathcal{F}$ a signed literal.

Signed formulas may be thought of as a formalization of meta-reasoning over
MVL’s [20]. A natural interpretation of the signed formula $S : \mathcal{F}$ is the query,
“Is the truth value of $\mathcal{F}$ in $S$?” The answer to such a query is yes or no; that is,
either the formula evaluates to some element in $S$ or does not. Observe that both
the query and the answer are at the meta-level; observe also that the question
cannot even be formulated at the object level. On the other hand, the question,
“What is the truth value of $\mathcal{F}$?” may be interpreted at the object level since the
answer is an element of $\Delta$.

For example, let $\Delta$ be the interval $[0, 1]$, where elements of $\Delta$ represent the
degree of belief of some fixed reasoning agent X. Thus, $\{1\} : P$ can be interpreted
as, “Is X certain of the proposition $P$?” and $[0, .1] : P$ asks, “Is X quite doubtful
of $P$?” These are yes or no questions.

To answer arbitrary queries, we represent queries about formulas in $\Lambda$ by
formulas in a classical logic $\Lambda_S$, the language of signed formulas; it is defined
as follows: The literals are signed formulas and the connectives are (classical)
conjunction and disjunction. It should be emphasized that a signed formula
$S : \mathcal{F}$ is a literal in $\Lambda_S$ regardless of the size or complexity of $\mathcal{F}$ and thus has no
component parts in the language $\Lambda_S$. The set of truth values is {true, false}.

An arbitrary interpretation for $\Lambda_S$ may make an assignment of true or false
to any signed formula (i.e., to any literal) in the usual way. Our goal is to focus
attention only on those interpretations that relate to the sign in a signed formula.
To accomplish this we restrict attention to $\Lambda$-consistent interpretations.
An interpretation $I$ over $\Lambda$ assigns to each literal, and therefore to each formula
$\mathcal{F}$, a truth value in $\Delta$, and the corresponding $\Lambda$-consistent interpretation $I_c$ is
defined by $I_c(S : \mathcal{F}) = true$ if $I(\mathcal{F}) \in S$; $I_c(S : \mathcal{F}) = false$ if $I(\mathcal{F}) \notin S$. Note that
this correspondence between the set of all interpretations over $\Lambda$ and the set of
$\Lambda$-consistent interpretations over $\Lambda_S$ is 1-to-1. Intuitively, $\Lambda$-consistent means an
assignment of true to all signed formulas whose signs are simultaneously achievable
via some interpretation over the original language. Restricting attention to
$\Lambda$-consistent interpretations yields a new consequence relation: If $\mathcal{F}_1$ and $\mathcal{F}_2$ are
formulas in $\Lambda_S$, we write $\mathcal{F}_1 \models_\Lambda \mathcal{F}_2$ if whenever $I_c$ is a $\Lambda$-consistent interpretation
and $I_c(\mathcal{F}_1) = true$, then $I_c(\mathcal{F}_2) = true$. Two formulas $\mathcal{F}_1$ and $\mathcal{F}_2$ in $\Lambda_S$ are
$\Lambda$-equivalent if $I_c(\mathcal{F}_1) = I_c(\mathcal{F}_2)$ for any $\Lambda$-consistent interpretation $I_c$; we write
$\mathcal{F}_1 \equiv_\Lambda \mathcal{F}_2$. The following lemma is immediate.

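Lemma 1. Let $I_c$ be a $\Lambda$-consistent interpretation, let $A$ be an atom and $\mathcal{F}$ a
formula in $\Lambda$, and let $S_1$ and $S_2$ be signs. Then:

1. $I_c(\emptyset : \mathcal{F}) = false$;

2. $I_c(\Delta : \mathcal{F}) = true$;

3. $S_1 \subseteq S_2$ if and only if $S_1 : \mathcal{F} \models_\Lambda S_2 : \mathcal{F}$ for all formulas $\mathcal{F}$;

4. There is exactly one $\delta \in \Delta$ such that $I_c(\{\delta\} : A) = true$. □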
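The lemma can be checked mechanically on a small finite example. The sketch below is only an illustration: the three-element truth-value set `DELTA`, the restriction to a single atom, and the helper names `lift`, `all_signs` and `entails` are assumptions, not part of the formal development.

```python
from itertools import combinations

DELTA = frozenset({0, 1, 2})   # hypothetical three-element set of truth values


def all_signs(delta):
    """All subsets of delta, i.e. all signs."""
    elems = sorted(delta)
    return [frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)]


def lift(interp):
    """Turn an interpretation (dict atom -> truth value) into its consistent lifting."""
    def i_c(sign, atom):
        return interp[atom] in sign
    return i_c


def entails(s1, s2, delta=DELTA):
    """S1:A entails S2:A under every consistent interpretation of the atom A."""
    return all(lift({"A": v})(s2, "A")
               for v in delta if lift({"A": v})(s1, "A"))


i_c = lift({"A": 1})
part1 = i_c(frozenset(), "A")                        # empty sign is never satisfied
part2 = i_c(DELTA, "A")                              # full sign is always satisfied
part3 = all(entails(s1, s2) == (s1 <= s2)
            for s1 in all_signs(DELTA) for s2 in all_signs(DELTA))
part4 = sum(i_c(frozenset({d}), "A") for d in DELTA)  # number of true singleton signs
```

In this example `part1` is false, `part2` and `part3` are true, and `part4` counts exactly one singleton sign satisfied by the interpretation.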

The usual results involving Robinson’s Unification Theorem are unaffected 
by this development, and techniques at the ground level can generally be lifted 
to the general level. As a result, attention is mostly restricted to the ground case 
for the remainder of the paper. 



2.1 Signed Inference 

In this section, we describe a method for adapting resolution to produce an inference
rule for $\Lambda_S$. Similar notions have been developed by Baaz and Fermüller [1]
and by Hähnle [9]. Many classical inference rules begin with links (complementary
pairs of literals). Such rules typically deal only with formulas in which all
negations are at the atomic level. Similarly, the inference techniques described
below require that signs be at the “atomic level.” To that end, a formula in $\Lambda_S$
is defined to be $\Lambda$-atomic if whenever $S : A$ is a literal in the formula, then $A$ is
an atom in $\Lambda$. Often (with annotated logic formulas, for example) all formulas
are assumed to be $\Lambda$-atomic.

The inference rule fj-resolution is based on the notion of complementary 
literals, which is generalized in the next lemma. The lemma is immediate from 
Part 4 of Lemma 1. 
Lemma 2. (The Reduction Lemma) Let $S_1 : A$ and $S_2 : A$ be $\Lambda$-atomic atoms in
$\Lambda_S$; then $S_1 : A \wedge S_2 : A \equiv_\Lambda (S_1 \cap S_2) : A$ and $S_1 : A \vee S_2 : A \equiv_\Lambda (S_1 \cup S_2) : A$. □

Consider a $\Lambda$-atomic formula $\mathcal{F}$ in $\Lambda_S$ in conjunctive normal form (CNF).
Let $C_j$, $1 \le j \le r$, be clauses in $\mathcal{F}$ that contain, respectively, $\Lambda$-atomic literals
$\{S_j : A\}$. Thus we may write $C_j = K_j \vee \{S_j : A\}$. Then the resolvent $R$ of the $C_j$’s
is defined to be the clause

$$\left(\bigvee_{j=1}^{r} K_j\right) \vee \left(\left(\bigcap_{j=1}^{r} S_j\right) : A\right)$$

The rightmost disjunct is called the residue of the resolution; observe that it is
unsatisfiable if its sign is empty and satisfiable if it is not.

In this generality, this definition must be augmented with the following obvious
simplification rules that stem from the Reduction Lemma. First, if the
residue is unsatisfiable it may simply be deleted from $R$. Secondly, whenever $R$
contains literals $S_i : B$, $1 \le i \le k$, we merge them into the literal $(\bigcup_{i=1}^{k} S_i) : B$; if
$\bigcup_{i=1}^{k} S_i = \Delta$, then $R$ is a tautology and may be deleted. Such merging
is essentially a generalization to MVL’s of ordinary classical ground merging. In
this paper, we are concerned with regular signs, and we henceforward restrict
merging to identical literals (with identical signs).

The classical notion of subsumption also generalizes to $\Lambda_S$: Clause $C$ subsumes
clause $D$ if, for every literal $S : A \in C$, there is a literal $S' : A \in D$ such
that $S \subseteq S'$. Observe that if $S \subseteq S'$, and if two clauses are resolved on the
literals $S : A$ and $S' : A$, then the residue will be $S : A$ (after all, $S \cap S' = S$), so
the clause containing $S : A$ must subsume the resolvent. This proves

Lemma 3. The resolvent produced by resolving on two literals in which the 
sign of one contains the sign of the other is superfluous in the sense that the 
resolvent is necessarily subsumed by one of its parents. □ 
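Signed resolution and subsumption on ground clauses can be sketched as follows. This is an illustrative encoding rather than the authors' implementation: a clause is a `frozenset` of pairs `(sign, atom)` with signs as `frozenset`s of truth values, only identical literals are merged (which the set representation does automatically), and the names `resolve` and `subsumes` are hypothetical.

```python
def resolve(clauses, atom):
    """Signed resolvent of the given clauses on the literals for `atom`.

    The signs of all literals on `atom` are intersected to form the residue;
    an empty (unsatisfiable) residue is dropped from the resolvent.
    """
    rest = set()
    residue = None
    for clause in clauses:
        for sign, a in clause:
            if a == atom:
                residue = sign if residue is None else residue & sign
            else:
                rest.add((sign, a))
    if residue:                               # keep only a non-empty residue
        rest.add((frozenset(residue), atom))
    return frozenset(rest)


def subsumes(c, d):
    """C subsumes D if every literal S:A in C has some S':A in D with S a subset of S'."""
    return all(any(a == b and s <= t for t, b in d) for s, a in c)


def lit(values, atom):
    return (frozenset(values), atom)


c1 = frozenset({lit({1, 2}, "p"), lit({0}, "q")})
c2 = frozenset({lit({0, 1}, "p")})
c3 = frozenset({lit({0}, "p")})

# A case in the spirit of Lemma 3: the sign {1} is contained in the sign {1, 2}.
parent = frozenset({lit({1}, "p"), lit({0}, "q")})
other = frozenset({lit({1, 2}, "p"), lit({2}, "r")})
resolvent = resolve([parent, other], "p")
```

Resolving `c1` and `c2` on `p` leaves the residue `{1}:p` next to `{0}:q`, resolving `frozenset({lit({1, 2}, "p")})` with `c3` yields the empty clause, and `parent` subsumes `resolvent`, so that resolution step is superfluous.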

2.2 Regular Signed Formulas and Annotated Logics 

Assume now that the set of truth values $\Delta$ is not simply an unordered set
of objects but instead forms a complete lattice under some ordering $\preceq$. The
greatest and least elements of $\Delta$ are denoted $\top$ and $\bot$, respectively, and Sup
and Inf denote, respectively, the supremum (least upper bound) and infimum
(greatest lower bound) of a subset of $\Delta$.

Let $(P; \preceq)$ be any partially ordered set, and let $Q \subseteq P$. Then
$\uparrow Q = \{y \in P \mid (\exists x \in Q)\ x \preceq y\}$. Note that $\uparrow Q$ is the smallest upset containing $Q$ (see [4]). If
$Q$ is a singleton set $\{x\}$, then we simply write $\uparrow x$. We say that a subset $Q$ of $P$ is
regular if for some $x \in P$, $Q = \uparrow x$ or $Q = (\uparrow x)'$ (the set complement of $\uparrow x$). We
call $x$ the defining element of the set. In the former case, we call $Q$ positive, and
in the latter negative. Observe that both $\Delta$ and $\emptyset$ are regular since $\Delta = \uparrow\bot$ and
$\emptyset = \Delta'$. Observe also that if $z = Sup\{x, y\}$, then $\uparrow x \cap \uparrow y = \uparrow z$. A signed formula
is regular if every sign that occurs in it is regular. By Part 1 of Lemma 1, we
may assume that no regular signed formulas have any signs of the form $(\uparrow x)'$,
where $x = \bot$, since in that case $(\uparrow x)' = \emptyset$.

An annotated logic is a signed logic in which only regular signs are allowed. 
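Regular signs are easy to experiment with on a small finite lattice. The sketch below assumes, purely for illustration, the lattice of pairs over {0, 1, 2} ordered componentwise (so that incomparable elements exist); the names `up`, `neg`, `leq` and `sup` are hypothetical helpers.

```python
from itertools import product

# Hypothetical truth-value lattice: pairs ordered componentwise.
P = list(product(range(3), repeat=2))
BOTTOM, TOP = (0, 0), (2, 2)


def leq(x, y):
    """Componentwise ordering on pairs."""
    return all(a <= b for a, b in zip(x, y))


def sup(x, y):
    """Least upper bound of two elements."""
    return tuple(max(a, b) for a, b in zip(x, y))


def up(x):
    """Positive regular sign: the upset generated by x."""
    return frozenset(y for y in P if leq(x, y))


def neg(x):
    """Negative regular sign: the complement of the upset generated by x."""
    return frozenset(P) - up(x)


# The observations from the text, checked exhaustively on this lattice.
full_is_regular = up(BOTTOM) == frozenset(P)
empty_is_regular = neg(BOTTOM) == frozenset()
meet_of_upsets = all(up(x) & up(y) == up(sup(x, y)) for x in P for y in P)
```

All three checks evaluate to true here: the full set is the upset of the least element, the empty set is its complement, and the intersection of two upsets is the upset of the supremum.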

2.3 Some Remarks on Notation 

A regular sign is completely characterized by its defining element, say $x$, and
its polarity (whether it is positive or negative). A regular signed atom may be
written $\uparrow x : A$, while the complement is the set $(\uparrow x)' : A$. Observe that
$(\uparrow x)' : A \equiv_\Lambda \sim(\uparrow x : A)$; that is, the signed atoms are complementary with respect to
$\Lambda$-consistent interpretations. With annotated logics, the most common notation
is $\mathcal{F} : x$ and $\sim \mathcal{F} : x$. There is no particular advantage of one or the other, and it
is perhaps unfortunate that both have arisen. We will follow the $x : \mathcal{F}$ convention
when dealing with signed logics and use $\mathcal{F} : x$ for annotated logics.²

Let us also remark here that, historically, annotated logics have been restricted
to $\Lambda$-atomic formulas. Though this restriction is unnecessary, it will be
obeyed for the remainder of the paper to simplify the presentation. Though
formulas are restricted to CNF, they are not restricted to be Horn.

2.4 Signed Resolution for Annotated Logics 

A sound and complete resolution proof procedure was defined for clausal annotated
logics in [15]. The procedure contains two inference rules that we will refer
to as annotated resolution and reduction.³ These two inference rules correspond
to disjoint instances of signed resolution. Two annotated literals $L_1$ and $L_2$ are
said to be complementary if they have the respective forms $A : \mu$ and $\sim(A : \rho)$,
where $\mu \geq \rho$, and annotated resolution is defined as follows: Given the annotated
clauses $(L_1 \vee D_1)$ and $(L_2 \vee D_2)$, where $L_1$ and $L_2$ are complementary, then
the annotated resolvent of the two clauses on the annotated literals $L_1$ and $L_2$
is $D_1 \vee D_2$.

Two clauses can be so resolved only if the annotation of the positive annotated
literal that is resolved upon is greater than or equal to the annotation of
the negative literal resolved upon. In that case the two clauses are said to be
resolvable on the annotated literals $L_1$ and $L_2$.

The reduction rule is defined when two occurrences of an atom have positive
signs. Suppose $(A : \mu_1 \vee D_1)$ and $(A : \mu_2 \vee D_2)$ are two annotated clauses in which $\mu_1$
and $\mu_2$ are incomparable. Then the annotated clause $(A : Sup\{\mu_1, \mu_2\}) \vee D_1 \vee D_2$ is
called a reductant of the two clauses, and we say that the two clauses are reducible
on the annotated literals $A : \mu_1$ and $A : \mu_2$.
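The two rules can be sketched on ground clauses as follows. The encoding is an assumption made for illustration: annotations are pairs of integers ordered componentwise, a literal is a triple `(is_positive, atom, annotation)`, a clause is a `frozenset` of literals, and the function names are hypothetical.

```python
def leq(x, y):
    """Componentwise ordering on annotations (pairs of integers)."""
    return all(a <= b for a, b in zip(x, y))


def sup(x, y):
    """Least upper bound of two annotations."""
    return tuple(max(a, b) for a, b in zip(x, y))


def annotated_resolvent(c1, l1, c2, l2):
    """Resolve c1 on the positive literal l1 against c2 on the negative literal l2.

    Allowed only if both literals share the atom and the positive annotation
    is greater than or equal to the negative one. Returns None otherwise.
    """
    (pos1, atom1, mu), (pos2, atom2, rho) = l1, l2
    if not (pos1 and not pos2 and atom1 == atom2 and leq(rho, mu)):
        return None
    return (c1 - {l1}) | (c2 - {l2})


def reductant(c1, l1, c2, l2):
    """Reductant of two clauses on positive literals with incomparable annotations."""
    (pos1, atom1, mu1), (pos2, atom2, mu2) = l1, l2
    if not (pos1 and pos2 and atom1 == atom2):
        return None
    if leq(mu1, mu2) or leq(mu2, mu1):
        return None                      # comparable annotations: rule not defined
    return frozenset({(True, atom1, sup(mu1, mu2))}) | (c1 - {l1}) | (c2 - {l2})


p10 = (True, "p", (1, 0))
p01 = (True, "p", (0, 1))
not_p11 = (False, "p", (1, 1))
c1, c2, goal = frozenset({p10}), frozenset({p01}), frozenset({not_p11})

red = reductant(c1, p10, c2, p01)
p11 = (True, "p", (1, 1))
```

In this example neither `c1` nor `c2` can be resolved against `goal` directly, but their reductant carries the annotation (1, 1), and resolving it against `goal` yields the empty clause. This is the situation in which reduction is needed for completeness.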

It is straightforward to see that the two inference rules are both captured 
by signed resolution. In particular, annotated resolution corresponds to an ap- 
plication of signed resolution (to regular signed clauses) in which the signs of 

² The reader can decide whether this will make both communities happy or unhappy.
³ Kifer and Lozinskii refer to their first inference rule simply as resolution. However,
since we are working with several resolution rules in this paper, appropriate adjectives
will be used to avoid ambiguity.
the selected literals are disjoint. Reduction, on the other hand, corresponds to
an application of signed resolution in which the signs of the selected literals are 
both positive. The next theorem is now immediate. Implicit in the term signed 
deduction in the theorem and in the remainder of the paper is deduction with 
signed resolution. 

Theorem 4. Suppose that $\mathcal{F}$ is a set of annotated clauses and that $\mathcal{D}$ is a
deduction of $\mathcal{F}$ using annotated resolution and reduction. Then $\mathcal{D}$ is a signed
deduction of $\mathcal{F}$. In particular, if $\mathcal{F}$ is an unsatisfiable set of first order annotated
clauses, then there is a signed refutation of $\mathcal{F}$. □

Signed resolution thus provides a general method for implementing anno- 
tated logics, but, as is often the case in theorem proving, there is a trade-off 
between generality and efficiency. Since annotated resolution and reduction are 
instances of signed resolution, and since there exist signed resolutions for which 
no corresponding annotated resolutions or reductions exist, the search space
induced by signed resolution will typically be larger than the search space induced
by annotated resolution and reduction. The following theorem, proved in [18], 
characterizes that class of signed deductions that corresponds to the deductions 
obtainable via annotated resolution and reduction. 

Theorem 5. Suppose $S_1, \ldots, S_n$ are regular signs whose intersection is empty,
and suppose that no proper subset of $\{S_1, \ldots, S_n\}$ has an empty intersection.
Then exactly one sign is negative; i.e., for some $j$, $1 \le j \le n$, $S_j = (\uparrow\mu_j)'$, and
for $i \ne j$, $S_i = \uparrow\mu_i$, where $\mu_1, \ldots, \mu_n \in \Delta$. □

The intersection of a positive regular sign and a negative regular sign is regular 
if and only if it is empty, and two negative signs can have a regular intersection if 
and only if one is a subset of the other. In view of Lemma 3, the latter situation 
need never be considered, so we define a signed deduction to be regular if every 
sign that appears in the deduction is regular and if no residue sign is produced 
by the intersection of two negative signs. Note that this implies that merging of 
literals is allowed only when regular signs are produced. The next two theorems 
are immediate. Theorem 7 states that the class of regular signed deductions 
is precisely the class of deductions using annotated resolution and reduction. 
As a result, restricting signed resolution to regular clauses captures annotated 
resolution and reduction without increasing the search space. 

Theorem 6. A signed deduction of a regular formula is regular if and only if the 
sign of every consistent residue is produced by the intersection of two positive 
regular signs. □ 

Theorem 7. Let $\mathcal{D}$ be a sequence of annotated clauses. Then $\mathcal{D}$ is an annotated
deduction if and only if $\mathcal{D}$ is a regular signed deduction. □

It follows from the theorem that regular signed resolution is complete.

Corollary. Suppose $\mathcal{F}$ is an unsatisfiable set of regular signed clauses. Then
there is a regular signed deduction of the empty clause from $\mathcal{F}$. □

2.5 Linear Signed Resolution and Horn Sets

Logic programming* may be thought of as providing a procedural interpretation
of Horn clauses through linear resolution. Similarly, annotated Horn clauses may
be interpreted procedurally through linear signed resolution. In light of Theorem 6
and the requirement that each step in a linear resolution involves the most
recent resolvent, linear signed resolution may not in general be regular. Annotated
logics, on the other hand, require regularity, which is incompatible with
the linear restriction. This creates a difficulty if the goal is to employ annotated
logics in a logic programming setting. This is illustrated by the next example.




Fig. 1. The Complete Lattice FOUR. 



Consider the annotated logic program (ALP) $P = \{p : t \leftarrow;\ p : f \leftarrow\}$,†
written over the lattice FOUR pictured in Figure 1 (see [2]). Consider now the
query $\leftarrow p : \top$. It is easy to see that $P \models p : \top$. However, linear annotated resolution
alone does not admit a refutation, since the reductant $p : \top \leftarrow$ must be computed
from the two program clauses. This clause would resolve with the query to yield
a refutation, but it cannot be inferred under the linear restriction because
the goal clause must be used in every step.

Kifer and Subrahmanian circumvent this difficulty by specifying that a
deduction consist only of applications of annotated resolution. However, any
inference may involve a resolution with an annotated clause obtained by implicit
applications of reduction. In the last example, a proof could be obtained by a
single deduction between the original query $\leftarrow p : \top$ and the reductant $p : \top \leftarrow$.

Unfortunately, implicit use of reduction creates new problems. For example, 
a proof that consists of only annotated resolution steps may contain steps that 
include clauses not in the original program. This makes the proof difficult to 
read. Moreover, application of the reduction rule can be expensive since it can 
occur at any time during a deduction, significantly expanding the search space. 

Signed resolution, on the other hand, consists of a single rule of inference, 
and is therefore amenable to a linear restriction, thus making an SLD-like proce- 
dure implementable. However, the proofs may not be regular. Equivalently, any 
irregular resolvent that arises cannot be annotated with a simple annotation. 
Hence directly applying a linear restriction would require an extension of the 
syntax of annotated atoms. 

* Several researchers have explored annotated logics with an interest in logic
programming. A more detailed account can be found in [19].

† We use the left arrow to represent definite clauses in the standard way.

2.6 A Linear Procedure with Annotations

There is a means of implementing a linear strategy for signed resolution using
annotations since, in view of Theorem 5, it is never necessary to resolve on
literals with two negative signs. One idea is to associate with each active goal
two annotations, a negative one, $\alpha$, and a positive one, $\beta$. The negative one is
always the goal’s initial annotation. The positive one, which changes during the
deduction, is the supremum of the annotations of the positive literals against
which the goal has been resolved. The two annotations $\alpha$ and $\beta$ represent the
sign $(\uparrow\alpha)' \cap \uparrow\beta$. More precisely, if $\leftarrow G : \alpha$ is the goal, then the initial positive
annotation is $\bot$. At each stage, if the positive annotation is $\beta$, and if the goal is
resolved against a rule whose head is $B : \rho$, there are two cases to consider. First, if
$\alpha \le \mathrm{Sup}\{\beta, \rho\}$, then the goal may be deleted. Otherwise, the positive annotation
$\beta$ is replaced by $\mathrm{Sup}\{\beta, \rho\}$, and the goal remains active. This procedure is sound
by the Reduction Lemma and is complete (see [18]).
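
The bookkeeping just described can be sketched for a single ground goal. This is only a minimal illustration under stated assumptions, not the authors' implementation: it fixes the lattice FOUR, ignores clause bodies and unification, takes the initial positive annotation to be the bottom element, and all names (`leq`, `sup`, `resolve_goal`) are hypothetical.

```python
BOT, TOP = "bot", "top"          # assumed encodings of the bottom and top of FOUR


def leq(a, b):
    """Lattice order of FOUR: bot <= t, f <= top, with t and f incomparable."""
    return a == b or a == BOT or b == TOP


def sup(a, b):
    """Least upper bound of two elements of FOUR."""
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return TOP                   # t and f are incomparable; their join is top


def resolve_goal(alpha, heads):
    """Resolve the goal <- G:alpha against clause heads G:rho, in the given order.

    Returns (solved, trace); trace lists the successive positive annotations.
    """
    beta = BOT                   # assumption: the supremum of no annotations is bottom
    trace = [beta]
    for rho in heads:
        beta = sup(beta, rho)
        trace.append(beta)
        if leq(alpha, beta):     # alpha <= Sup{beta, rho}: the goal may be deleted
            return True, trace
    return False, trace
```

For the two facts with head annotations t and f and the goal annotation top, `resolve_goal(TOP, ["t", "f"])` succeeds after two steps, with the positive annotation moving from bottom to t to top; the second annotation is doing the work of the reduction.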

Although this technique does admit linearity, reductions are nevertheless
implicitly performed; they are represented by the second annotation. Of course,
to be linear, each of the methods we have discussed must either allow irregular
signs or must capture reductions (implicitly or explicitly). Unfortunately, many
redundant reduction steps are possible with these methods. As we shall see,
$\mho$-resolution can avoid many such redundancies and provide a cleaner input-style
deduction in which no extra annotations are necessary; i.e., only strictly regular
input deductions are required for ALPs using $\mho$-resolution.

3 Decomposition and $\mho$-Resolution

The issues discussed may be summarized as follows: 

— For regular signed logics — equivalently, annotated logics — signed resolution 
provides generality, while annotated resolution with reduction provides effi- 
ciency. 

— For definite Horn regular signed logics — annotated logic programs — signed 
resolution is amenable to the linear (and thus input) restriction, while an- 
notated resolution and reduction is not. Signed resolution with the linear 
restriction does not guarantee regularity. 

The problem is to reconcile the shortcomings of these two approaches. When
the goal annotation $\mu$ is not less than or equal to the program clause annotation
$\rho$, the intersection of $(\uparrow\mu)'$ and $\uparrow\rho$ is simply not regular and cannot
be represented by an annotation. Signed resolution would produce the goal
$((\uparrow\mu)' \cap \uparrow\rho) : p$. However, since this goal can be soundly deduced, so can any
goal of the form $S : p$, where $S \supseteq ((\uparrow\mu)' \cap \uparrow\rho)$. Intuitively, $\mho$-resolution works by
determining an annotation $\mu_0$ such that $(\uparrow\mu_0)'$ is the smallest regular set that
contains $((\uparrow\mu)' \cap \uparrow\rho)$. Thus we may soundly infer the new goal $p : \mu_0$ without
losing regularity or linearity; completeness is not lost either.

We begin the development of $\mho$-resolution by introducing decomposition, an
inference rule for annotated logics. Decomposition by itself does not solve the
problems alluded to above, but it leads naturally to $\mho$-resolution. Consider first
the next example.

Example. Let $\Delta$ be the lattice FOUR of Figure 1 and let $Q$ be the query $\leftarrow p : \top$.
Further, let $P$ be the program

$P_0$: $p : t \leftarrow q : t$
$P_1$: $p : f \leftarrow q : f$
$P_2$: $q : t \leftarrow$
$P_3$: $q : f \leftarrow$

It is easy to see that $P \models p : \top$. Thus the query $\leftarrow p : \top$ should be provable. Using
the technique of carrying two annotations (the superscript below is the goal’s
current positive annotation), we may arrive at a proof using signed
resolution. The initial query is provable by first resolving it against $p : t \leftarrow q : t$
to produce the query $\leftarrow p : \top^{t},\ q : t$. Resolving next against $p : f \leftarrow q : f$ yields
$\leftarrow p : \top^{\top},\ q : t,\ q : f$, which simplifies to $\leftarrow q : t,\ q : f$. This query may now be proved
through two simple resolution steps against $P_2$ and $P_3$.



3.1 Decomposition 

Suppose that, instead of maintaining two annotations, after determining that 
the initial query cannot resolve (using standard annotated resolution) with any 
program clause, the initial query is decomposed into the two-goal query $\leftarrow p : t,\ p : f$.
Then the two goals can be resolved through annotated resolution against
the program clauses in a straightforward way. More generally, in view of the 
fact that proving p : T can be accomplished by proving both p : t and p : f, 
which are easier to prove, our goal is to set up a rule of inference that performs 
such a decomposition whenever a suitable clause cannot be found for annotated 
resolution. 

Definition. Let $Q$ be the query $\leftarrow A_1 : \mu_1, \ldots, A_m : \mu_m$, and suppose that $A_i : \rho_1$
and $A_i : \rho_2$ are literals such that $\mu_i \le \mathrm{Sup}\{\rho_1, \rho_2\}$. Then $A_i : \mu_i$ is said to
decompose to $\{A_i : \rho_1, A_i : \rho_2\}$, and the decomposition of $Q$ with respect to $A_i$ is

$\leftarrow A_1 : \mu_1, \ldots, A_{i-1} : \mu_{i-1}, A_i : \rho_1, A_i : \rho_2, A_{i+1} : \mu_{i+1}, \ldots, A_m : \mu_m.$

Theorem 8. Suppose $Q$ is a ground annotated query. Let $Q^D$ be a decomposition
of $Q$ and $I$ be a $\Delta$-interpretation. If $I_c(Q)$ = true, then $I_c(Q^D)$ = true. □

Decomposition, together with annotated resolution, constitutes an SLD-style 
proof procedure, and it does so using only regular signed literals. By decomposing 
an annotation into two “lower annotations,” decomposition is a sort of divide- 
and-conquer strategy. In the previous example, it was straightforward to choose 
two annotations into which the annotation T associated with the query can be 
decomposed. Not surprisingly, it is not always this easy. 

Example. Let $\Delta$ be the lattice of Figure 2, and let $P$ be the program

$P_1$: $p : V \leftarrow q : V$
$P_2$: $q : 1 \leftarrow$
$P_3$: $q : 2 \leftarrow$

where $V$ is an arbitrary annotation.

Suppose that $Q$ is the query $\leftarrow p : \top$. Clearly $P \models p : \top$. If we chose to
decompose $p : \top$ into the two literals $p : 1$ and $p : 3$, then the corresponding query
$\leftarrow p : 1,\ p : 3$ does not have a refutation using only annotated resolution. On the
other hand, if $p : \top$ is decomposed into the two literals $p : 1$ and $p : 2$, there is a
refutation using annotated resolution.
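
The guessing involved in this example can be made concrete with a small enumeration. The sketch below is illustrative only: it assumes the 1-2-3 lattice has a bottom, a top, and the three pairwise incomparable elements 1, 2, 3; it hard-codes the facts of the example program; and the function names are hypothetical.

```python
from itertools import combinations

BOT, TOP = "bot", "top"
MIDDLE = ["1", "2", "3"]         # assumed pairwise incomparable middle elements


def leq(a, b):
    """Order of the 1-2-3 lattice: bot below everything, top above everything."""
    return a == b or a == BOT or b == TOP


def sup(a, b):
    """Least upper bound; two distinct middle elements join to top."""
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return TOP


FACTS_Q = ["1", "2"]             # annotations of the facts q:1 <- and q:2 <-


def provable_without_reduction(rho):
    """p:rho resolves via p:V <- q:V to q:rho, which needs a fact q:f with rho <= f."""
    return any(leq(rho, f) for f in FACTS_Q)


def decompositions_of_top():
    """Pairs (rho1, rho2) of middle elements with top <= Sup{rho1, rho2}."""
    return [(a, b) for a, b in combinations(MIDDLE, 2) if leq(TOP, sup(a, b))]


def useful_decompositions():
    """Decompositions whose two components are both provable by annotated resolution."""
    return [(a, b) for a, b in decompositions_of_top()
            if provable_without_reduction(a) and provable_without_reduction(b)]
```

All three pairs of middle elements are admissible decompositions of the top annotation, but only the pair (1, 2) leads to a refutation for this program, which is the point of the example.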

Fig. 2. 1-2-3 Lattice



The example demonstrates how difficulties in choosing a decomposition may
arise because the two annotations into which the original should be decomposed
depend on the structure of $\Delta$ and on the program clauses. In the case of
FOUR, the choice was immediate, regardless of the annotations of the program
clauses. On the other hand, for the 1-2-3 lattice, the usefulness of the choice
depended on what information was contained within the program. In general,
decomposition may be expensive because the number of guesses may be large
and because many of them may not lead to a proof. However, there may be
properties embedded in certain classes of lattices that can be used to reduce the
number of guesses. In the next section, we define one class of lattices, the ordinary
lattices, and introduce a simple and elegant SLD-style proof procedure based
on $\mho$-resolution, which is an enhancement to decomposition in which no guessing
is required.

3.2 Ordinary Lattices and $\mho$-Resolution

Decomposition can be employed with any lattice, but its application may be
expensive. One way to improve decomposition is to modify it to be a binary
inference rule in which the selection of one component is based on the existence
of an appropriate head literal with annotation $\rho$ in a program clause. Based on $\mu$
and $\rho$, an annotation $\gamma$ may be guessed such that $\mu \le \mathrm{Sup}\{\gamma, \rho\}$. The inference
rule $\mho$-resolution, described below, exploits this observation by further restricting
the class of lattices to those with the property that for any two values $\mu, \rho$ of the
lattice, there is a best choice for $\gamma$. We call such a collection of lattices ordinary
lattices. First, given $\mu, \rho \in \Delta$, we use $\mathcal{M}(\mu, \rho)$ to denote the set of annotations
for which the least upper bound of $\rho$ with an element of the set is greater than
or equal to $\mu$. Formally, $\mathcal{M}(\mu, \rho) = \{\gamma \in \Delta \mid \mathrm{Sup}\{\gamma, \rho\} \ge \mu\}$.

For the 1-2-3 lattice in Figure 2, if $\mu = \top$ and $\rho = 1$, $\mathcal{M}(\mu, \rho) = \{2, 3, \top\}$.
For the lattice FOUR in Figure 1, if $\mu = \top$ and $\rho = t$, $\mathcal{M}(\mu, \rho) = \{f, \top\}$.

Definition. Given $\mu, \rho \in \Delta$, the $\mho$ operator* is defined as follows:
$\mho(\mu, \rho) = \mathrm{Inf}(\mathcal{M}(\mu, \rho))$. A lattice $\Delta$ is said to be ordinary if
$\forall \mu, \rho \in \Delta$, $\mho(\mu, \rho) \in \mathcal{M}(\mu, \rho)$.
To simplify the notation, given $\mu, \rho_1, \ldots, \rho_m$, we write $\mho(\mu, \rho_1, \ldots, \rho_m)$
for $\mho(\mho(\ldots\mho(\mho(\mu, \rho_1), \rho_2)\ldots), \rho_m)$.

Notice that the lattice 1-2-3 is not an ordinary lattice since the set
$\mathcal{M}(\mu, \rho) = \{2, 3, \top\}$ has no least element. The lattice FOUR, however, is an
ordinary lattice.
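
The definitions of $\mathcal{M}$, $\mho$ and ordinariness can be checked mechanically on the two small lattices used so far. The following sketch is an illustration under assumptions: both lattices are encoded as a bottom, a top, and an antichain in between (which is what FOUR and the 1-2-3 lattice are taken to be here), and the names `M`, `mho` and `is_ordinary` are hypothetical.

```python
BOT, TOP = "bot", "top"
FOUR = [BOT, "t", "f", TOP]
ONE_TWO_THREE = [BOT, "1", "2", "3", TOP]


def leq(a, b):
    """Order shared by both lattices: bot <= middle elements <= top."""
    return a == b or a == BOT or b == TOP


def sup(a, b):
    """Least upper bound; incomparable middle elements join to top."""
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return TOP


def M(lattice, mu, rho):
    """All gamma with Sup{gamma, rho} >= mu."""
    return {g for g in lattice if leq(mu, sup(g, rho))}


def mho(lattice, mu, rho):
    """Inf(M(mu, rho)): the greatest element lying below every member of M(mu, rho)."""
    m = M(lattice, mu, rho)
    lower = [x for x in lattice if all(leq(x, g) for g in m)]
    return next(x for x in lower if all(leq(y, x) for y in lower))


def is_ordinary(lattice):
    """A lattice is ordinary if mho(mu, rho) always belongs to M(mu, rho)."""
    return all(mho(lattice, mu, rho) in M(lattice, mu, rho)
               for mu in lattice for rho in lattice)
```

On these encodings the check reproduces the claims above: FOUR passes, while the 1-2-3 lattice fails because the infimum of {2, 3, top} is the bottom element, which is not itself in the set.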

The next property of ordinary lattices is important for proving the completeness
of $\mho$-resolution.

Lemma 9. Let $\Delta$ be an ordinary lattice. Then $\mho(\mu, \rho_1, \ldots, \rho_m) = \bot$ iff $\mu \le \rho$,
where $\rho = \mathrm{Sup}\{\rho_1, \ldots, \rho_m\}$. □

If $\Delta$ is an ordinary lattice, it is easy to see that, given a query literal $p : \mu$ and a
program clause head $p : \rho$, the natural choice for the decomposition
of $p : \mu$ based on $p : \rho$ is $p : \rho$ and $p : \mho(\mu, \rho)$. The first part of the decomposition
is obvious, and the second part represents the “simplest” remaining condition
that must be solved in order to solve the original query. This leads to the desired
inference rule.

Definition. Given an ALP over an ordinary lattice, suppose $Q$ is the query

$\leftarrow A_1 : \mu_1, \ldots, A_m : \mu_m$

and $C$ is the program clause $A : \rho \leftarrow Body$, where $A_i$ and $A$ can be unified with
mgu $\theta$, and where $\mho(\mu_i, \rho) \ne \mu_i$. Then the $\mho$-resolvent of $Q$ and $C$ with respect
to $A_i$ is the query

$\leftarrow (A_1 : \mu_1, \ldots, A_{i-1} : \mu_{i-1}, A : \mho(\mu_i, \rho), Body, A_{i+1} : \mu_{i+1}, \ldots, A_m : \mu_m)\theta.$

A $\mho$-deduction of a query from a given ALP and initial query is defined in the
usual way. A $\mho$-deduction of a query from an ALP is a $\mho$-proof if

$\leftarrow A_1 : \bot, A_2 : \bot, \ldots, A_n : \bot$

is the last clause in the $\mho$-deduction.

Lemma 10. The inference rule $\mho$-resolution is sound for ordinary lattices. □
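
To see how the rule drives an SLD-style search, the following sketch implements it for ground (propositional) programs over FOUR. It is a simplified illustration, not the procedure of the paper: unification is omitted, the leftmost unsolved goal is always selected, a depth bound (an added assumption) guards against non-termination, and the representation of clauses as `(atom, annotation, body)` triples is hypothetical.

```python
BOT, TOP = "bot", "top"
FOUR = [BOT, "t", "f", TOP]


def leq(a, b):
    return a == b or a == BOT or b == TOP


def sup(a, b):
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return TOP


def mho(mu, rho):
    """Least gamma in FOUR with Sup{gamma, rho} >= mu (exists since FOUR is ordinary)."""
    m = [g for g in FOUR if leq(mu, sup(g, rho))]
    return next(g for g in m if all(leq(g, h) for h in m))


def prove(goals, program, depth=20):
    """Search for a proof: a goal list in which every annotation is bottom."""
    pending = [i for i, (_, mu) in enumerate(goals) if mu != BOT]
    if not pending:
        return True
    if depth == 0:
        return False
    i = pending[0]
    atom, mu = goals[i]
    for head, rho, body in program:
        if head != atom:
            continue
        new_mu = mho(mu, rho)
        if new_mu == mu:         # the clause makes no progress on this goal: skip it
            continue
        resolvent = goals[:i] + [(atom, new_mu)] + list(body) + goals[i + 1:]
        if prove(resolvent, program, depth - 1):
            return True
    return False


# The program P0..P3 of the earlier example over FOUR.
PROGRAM = [
    ("p", "t", [("q", "t")]),
    ("p", "f", [("q", "f")]),
    ("q", "t", []),
    ("q", "f", []),
]
```

With this encoding, `prove([("p", TOP)], PROGRAM)` succeeds: the goal annotation drops from top to f after resolving against the first clause and to bottom after the second, and the body goals are then discharged by the two facts.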



* The usual pronunciation of this symbol, which is an upside-down $\Omega$, is “mo”,
with a long O.

Fig. 3. Lattice with Defaults



The next example demonstrates the use of $\mho$-resolution over the ordinary
lattice in Figure 3. Notice that the lattice consists of the values $dt$, $df$, and
$d\top$, in addition to the four elements of FOUR. Ginsberg introduced this lattice
to distinguish between truth and default facts [7]. The symbols $dt$ and $df$ may
be regarded as “concluded true, respectively, false, by default.” Following this
intuition, $d\top$ stands for “inconsistent default conclusion” and is distinguished
from the stronger inconsistency represented by $\top$.

Example. Let $\Delta$ be the lattice of Figure 3, let $P$ be the program

$P_1$: $p : dt \leftarrow$
$P_2$: $p : df \leftarrow$

and let $Q$ be the query $\leftarrow p : d\top$. Then a $\mho$-proof proceeds as follows.

$Q = Q_0$: $\leftarrow p : d\top$   initial query;
$Q_1$: $\leftarrow p : df$   $\mho$-resolvent of $P_1$ and $Q_0$; note: $\mathcal{M}(d\top, dt) = \{df, d\top\}$
$Q_2$: $\leftarrow p : \bot$   $\mho$-resolvent of $P_2$ and $Q_1$; note: $\mathcal{M}(df, df) = \Delta$

The inference rule $\mho$-resolution is a variant of the more general decomposition;
it exploits properties intrinsic to the structure of the ordinary lattices. As
such, $\mho$-resolution may be regarded as a semantically-based inference rule. One
nice feature of $\mho$-resolution is that it allows simple SLD-style proof procedures
for annotated logic programs over ordinary lattices. This procedure eliminates
the expensive reduction rule, yet it does not require irregular deductions. Pruning
of the search occurs naturally in $\mho$-resolution. For instance, consider the
following program $P$ over the lattice FOUR.

Example. 

P = { p:f g : f , P : t P^, 

Oi C2 C3 C4 

Observe that $P \models p : t$. However, the query $\leftarrow p : t$ cannot $\mho$-resolve with the
head of $C_1$, $p : f$, since $\mho(t, f) = t$. Thus, $C_2$ and $C_3$ are acceptable candidates
for $\mho$-resolution with the query, but $C_1$ is not.

In essence, $C_1$ is not required for solving the query, and detection of such
irrelevancies is built into $\mho$-resolution. In the same example, other inference
techniques will not prevent resolution of the query with $C_1$, or the use of $C_1$ for
reduction. This may represent considerable savings, since the body of $C_1$ could
be arbitrarily large and/or could lead to unnecessary deductions. Indeed, proof
space reduction is a primary motivation for the introduction of $\mho$-resolution. In
the next section, we show that $\mho$-resolution guarantees not only a “smaller” proof
space, but one in which only the “best” proofs are admitted. The basic result is
that, for each proof of a query using annotated resolution and reduction, there
is a $\mho$-proof of the same query using at most the same number of inferences.
An example showing how search space pruning is built into $\mho$-resolution is also
described.



3.3 Proof Space and Search Space Considerations 

In this section, deductions are viewed as sequences of inferences rather than 
as sequences of clauses. This is merely a matter of convenience. Deductions in 
standard annotated logic programming contain both annotated resolutions and 
reductions, and we wish to refer easily to the particular inference rule applied 
at a given step. 

It is impractical (and unnecessary) to deal with the space of all annotated 
resolution/reduction deductions. For one thing, given any deduction, a different 
deduction can be obtained merely by adding redundant steps. Thus we will 
consider deductions that are minimal in the sense that removal of any one step 
results in a sequence that does not formally comprise a deduction. Note that 
this in no way implies that such deductions are shortest in any sense. 

For another, the steps of a given deduction may be reordered in many ways. 
For example, the literals in the goal clause of an ALP computation may be solved 
in different orders. This amounts to reordering the annotated resolution steps 
of a deduction, and such a reordering produces a distinct deduction. However, 
a given annotated resolution step may require a reductant that is produced by 
a series of reduction steps. These reductions (treated as implicit in annotated 
logic programming) can be performed in any order and at any point prior to 
the resolution. Although different reduction orders will be treated as different 
deductions, we will assume that, in aggregate, all reductions necessary to enable a 
resolution occur immediately prior to that resolution step. It is obvious that any
deduction, say $\mathcal{D}'$, not allowed under these assumptions is in fact an essentially
redundant variation of one, say $\mathcal{D}$, that is allowed, and $\mathcal{D}'$ is no shorter than $\mathcal{D}$.

In sum, we assume that an annotated logic programming deduction $\mathcal{D}$ consists
of a sequence of annotated resolution steps: $S_1, S_2, \ldots, S_n$. Furthermore, if
$S$ is any one of those steps, we assume that $S$ has the form $\mathcal{R}_1, \mathcal{R}_2, \ldots, \mathcal{R}_k, \mathcal{R}$,
$k \ge 0$, where the $\mathcal{R}_i$’s are reductions and $\mathcal{R}$ is annotated resolution. Let $A : \mu$ be
the goal removed by step $\mathcal{R}$, and let $A : \rho_0, A : \rho_1, \ldots, A : \rho_k$ be the heads of
the $k + 1$ program clauses used in $S$. We will show that corresponding to these
$k$ reductions followed by a resolution, there is a sequence of $k + 1$ $\mho$-resolutions,
the last of which produces the goal $A : \bot$. (Starting with the goal clause, both
deductions involve exactly the $k + 1$ program clauses used for the $k$ reductions.
In both cases, the resulting goal list will be expanded to include all goals in
the bodies of those program clauses. So the goal resolved away by annotated
resolution is the only one of concern.)

The series of $k + 1$ $\mho$-resolutions will, by definition, yield the goal
$A : \mho(\mu, \rho_0, \ldots, \rho_k)$. By the definition of annotated resolution,
$\mu \le \mathrm{Sup}\{\rho_0, \rho_1, \ldots, \rho_k\}$. But then by Lemma 9, $\mho(\mu, \rho_0, \ldots, \rho_k) = \bot$, and
we have proved:

Theorem 11. Let $P$ be an annotated logic program, let $Q$ be a query, and let
$\mathcal{D}_{\mathcal{AR}}$ be a proof of $Q$ from $P$ by annotated resolution and reduction. Then there
is a proof of $Q$ from $P$ by $\mho$-resolution that is no longer than $\mathcal{D}_{\mathcal{AR}}$.

Corollary. Suppose $\mathcal{F}$ is an unsatisfiable set of regular signed Horn clauses.
Then there is a $\mho$-resolution input deduction of the empty clause from $\mathcal{F}$. □
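
The argument above rests on Lemma 9: iterating the $\mho$ operator over the head annotations reaches bottom exactly when the goal annotation lies below their supremum. As a sanity check only (an exhaustive test on one small lattice, not a proof), the sketch below verifies this equivalence over FOUR for all sequences of up to three head annotations; the helper names are hypothetical.

```python
from functools import reduce
from itertools import product

BOT, TOP = "bot", "top"
FOUR = [BOT, "t", "f", TOP]


def leq(a, b):
    return a == b or a == BOT or b == TOP


def sup(a, b):
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return TOP


def mho(mu, rho):
    """Least gamma in FOUR with Sup{gamma, rho} >= mu."""
    m = [g for g in FOUR if leq(mu, sup(g, rho))]
    return next(g for g in m if all(leq(g, h) for h in m))


def iterated_mho(mu, rhos):
    """mho(...mho(mho(mu, rho_1), rho_2)..., rho_m), folding from the left."""
    return reduce(mho, rhos, mu)


def lemma9_holds_on_four(max_len=3):
    """Check: iterated mho is bottom iff mu <= Sup{rho_1, ..., rho_m}."""
    for mu in FOUR:
        for m in range(1, max_len + 1):
            for rhos in product(FOUR, repeat=m):
                rho = reduce(sup, rhos, BOT)
                if (iterated_mho(mu, rhos) == BOT) != leq(mu, rho):
                    return False
    return True
```

For instance, `iterated_mho(TOP, ("t", "f"))` is bottom, matching the two $\mho$-resolution steps that replace one reduction followed by one annotated resolution in the FOUR example.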



3.4 Characterizing Ordinary Lattices 

Theorem 11 says that $\mho$-resolution works well for annotated logic programs when
it applies. If there are few applications which use ordinary lattices, $\mho$-resolution
loses its significance. One large class of ordinary lattices is the collection of
linear lattices. They are easily shown to be ordinary, and this class of lattices is
important for applications of annotated logic to quantitative reasoning. However,
when the truth domain is linear, no reductions are necessary, and $\mho$-resolution
reduces to annotated resolution.

A richer class of truth domains is the collection of distributive lattices (de- 
fined below). A number of applications of annotated logics involve distributive 
lattices [17]. Finite distributive lattices are ordinary; whether this result gener- 
alizes to infinite distributive lattices remains unanswered. 

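Definition. A lattice $\Delta$ is said to be distributive if it satisfies the distributive
laws:

$(\forall \alpha, \beta, \gamma \in \Delta)\quad \mathrm{Inf}\{\alpha, \mathrm{Sup}\{\beta, \gamma\}\} = \mathrm{Sup}\{\mathrm{Inf}\{\alpha, \beta\}, \mathrm{Inf}\{\alpha, \gamma\}\}$

and

$(\forall \alpha, \beta, \gamma \in \Delta)\quad \mathrm{Sup}\{\alpha, \mathrm{Inf}\{\beta, \gamma\}\} = \mathrm{Inf}\{\mathrm{Sup}\{\alpha, \beta\}, \mathrm{Sup}\{\alpha, \gamma\}\}$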

The lattice in Figure 3 is ordinary, and it can be shown that it is also dis- 
tributive. 
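
The two distributive laws are easy to test by brute force on a finite lattice. The sketch below does so for FOUR and for the 1-2-3 lattice, both encoded (as an assumption of this illustration) as a bottom, a top, and an antichain in between; the function names are hypothetical.

```python
from itertools import product

BOT, TOP = "bot", "top"
FOUR = [BOT, "t", "f", TOP]
ONE_TWO_THREE = [BOT, "1", "2", "3", TOP]


def leq(a, b):
    return a == b or a == BOT or b == TOP


def sup(a, b):
    """Least upper bound; incomparable middle elements join to top."""
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return TOP


def inf(a, b):
    """Greatest lower bound; incomparable middle elements meet at bottom."""
    if leq(a, b):
        return a
    if leq(b, a):
        return b
    return BOT


def is_distributive(lattice):
    """Check both distributive laws for every triple of elements."""
    return all(
        inf(a, sup(b, c)) == sup(inf(a, b), inf(a, c))
        and sup(a, inf(b, c)) == inf(sup(a, b), sup(a, c))
        for a, b, c in product(lattice, repeat=3)
    )
```

FOUR satisfies both laws, whereas the 1-2-3 lattice does not: Inf{1, Sup{2, 3}} is 1, but Sup{Inf{1, 2}, Inf{1, 3}} is the bottom element. This is consistent with the earlier observation that the 1-2-3 lattice is not ordinary.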

Theorem 12. Finite distributive lattices are ordinary. □

Abstract. We present a matrix characterization of logical validity in the
multiplicative fragment of linear logic with exponentials. In the process
we elaborate a methodology for proving matrix characterizations correct
and complete. Our characterization provides a foundation for matrix-based
proof search procedures for $\mathcal{MELL}$ as well as for procedures which
translate machine-found proofs back into the usual sequent calculus.



1 Introduction 

Linear logic [12] has become known as a very expressive formalism for reasoning
about action and change. During its rather rapid development linear logic has
found applications in logic programming [14,19], modeling concurrent computation
[11], planning [18], and other areas. Its expressiveness, however, results in
a high complexity. Propositional linear logic is undecidable. The multiplicative
fragment ($\mathcal{MLL}$) is already $\mathcal{NP}$-complete [16]. The complexity of the multiplicative
exponential fragment ($\mathcal{MELL}$) is still unknown. Consequently, proof search
in linear logic is difficult to automate. Girard’s sequent calculus [12], although
covering all of linear logic, contains too many redundancies to be useful for
efficient proof search. Attempts to remove permutabilities from sequent proofs
[1,10] and to add proof strategies [23] have provided significant improvements.
But because of the use of sequent calculi some redundancies remain. Proof nets
[7], on the other hand, can handle only a fragment of the logic.

Matrix characterizations of logical validity, originally developed as the foundation
of the connection method for classical logic [2,3,5], avoid many kinds of
redundancies contained in sequent calculi and yield a compact representation
of the search space. They have been extended successfully to intuitionistic and
modal logics [24] and serve as a basis for a uniform proof search method [20] and
a method for translating matrix proofs back into sequent proofs [21,22]. Resource
management similar to multiplicative linear logic is addressed by the linear
connection method [4]. Fronhöfer [8] gives a matrix characterization of $\mathcal{MLL}$ that
captures some aspects of weakening and contraction but does not appear to
generalize any further. In [15] we have developed a matrix characterization for $\mathcal{MLL}$
and extended the uniform proof search and translation procedures accordingly.

In this paper we present a matrix characterization for the full multiplicative
exponential fragment including the constants 1 and $\bot$. This characterization
uses Andreoli’s focusing principle [1] as one of its major design steps and does
not appear to share the limitations of the previous approaches. Our approach
also includes a methodology for developing such a characterization. By introducing
a series of intermediate calculi the development of a matrix characterization
for $\mathcal{MELL}$ becomes manageable. Each newly introduced calculus adds one
compactification principle and is proven correct and complete with respect to the
previous one. We expect that this methodology will generalize to further fragments
of linear logic as well as to a wide spectrum of other non-classical logics.

We first create a compact representation of Girard’s sequent calculus [12]
by adapting Smullyan’s tableaux notation to $\mathcal{MELL}$ (Section 2). By introducing
the notion of multiplicities, i.e. an eager handling of contraction and a lazy
handling of weakening, we arrive at a dyadic calculus $\Sigma'_2$ which we then refine to
a triadic calculus $\Sigma'_3$ by removing redundancies which are due to permutabilities
(Section 3). In Section 4 we develop a calculus $\Sigma'_{pos}$ which operates on positions
in a formula tree instead of on the subformulas themselves. In order to express
the peculiarities of some connectives we insert special positions into the formula
tree. Finally, in Section 5, we arrive at the matrix characterization, technically
the most demanding but also the most compact of all calculi. Proofs are only
sketched briefly. Details can be found in the first author’s technical report [17].

2 Multiplicative Exponential Linear Logic 

Linear logic [12] treats formulas like resources that disappear after their use
unless explicitly marked as reusable. Technically, it can be seen as the outcome
of removing the rules for contraction and weakening from the classical sequent
calculus and re-introducing them in a controlled manner. Linear negation ${}^\perp$ is
involutive like classical negation. The two traditions for writing the sequent rule
for conjunction result in two different conjunctions $\otimes$ and $\&$ and two different
disjunctions ⅋ and $\oplus$. The constant true splits up into 1 and $\top$ and false
into $\bot$ and 0. The unary connectives ? and ! mark formulas for a controlled
application of weakening and contraction. Quantifiers $\forall$ and $\exists$ are added as usual.

Linear logic can be divided into the multiplicative, additive, and exponential 
fragment. While in the multiplicative fragment resources are used exactly once, 
resource sharing is enforced in the additive fragment. Exponentials mark
formulas as reusable. All fragments exist in their own right and can be combined
freely. The full power of linear logic comes from combining all of them.

Throughout this article we will focus on multiplicative exponential linear
logic ($\mathcal{MELL}$), the combination of the multiplicative and exponential fragments,
leaving the additive fragment and the quantifiers out of consideration. $\otimes$, ⅋,
$\multimap$, 1, $\bot$, !, and ? are the connectives of $\mathcal{MELL}$. Linear negation expresses the
difference between resources that are to be used up and resources to be produced.
$F^\perp$ means that the resource $F$ must be produced. Having a resource $F_1 \otimes F_2$
means having $F_1$ as well as $F_2$. $F_1 \multimap F_2$ allows the construction of $F_2$ from $F_1$.
$F_1$ ⅋ $F_2$ is equivalent to $F_1^\perp \multimap F_2$ and to $F_2^\perp \multimap F_1$. Having a resource 1 has
no impact while nothing can be constructed when $\bot$ is used up. A resource
$!F$ acts like a machine which produces any number of copies of $F$. During the
construction of $!F$ only such machines can be used. ? is the dual to !.

Rule    Premise(s)                                                Conclusion
axiom   (none)                                                    $\langle A,+\rangle, \langle A,-\rangle$
τ       (none)                                                    $\tau$
α       $\Gamma, succ_1(\alpha), succ_2(\alpha)$                  $\Gamma, \alpha$
o       $\Gamma, succ_1(o)$                                       $\Gamma, o$
ω       $\Gamma$                                                  $\Gamma, \omega$
β       $\Gamma_1, succ_1(\beta)$ and $\Gamma_2, succ_2(\beta)$   $\Gamma_1, \Gamma_2, \beta$
ν       $\Gamma, succ_1(\nu)$                                     $\Gamma, \nu$
π       $\boldsymbol{\nu}, succ_1(\pi)$                           $\boldsymbol{\nu}, \pi$
w       $\Gamma$                                                  $\Gamma, \nu$
c       $\Gamma, \nu, \nu$                                        $\Gamma, \nu$

Table 1. Sequent calculus $\Sigma'_1$ for $\mathcal{MELL}$ in uniform notation



The validity of a linear logic formula can be proven syntactically by using a
sequent calculus. For multi-sets $\Gamma$ and $\Delta$ of formulas, $\Gamma \longrightarrow \Delta$ is called a sequent.
It can be understood as the specification of a transformation which constructs
$\Delta$ from $\Gamma$. The formulas in $\Gamma$ are connected implicitly by $\otimes$ while the formulas
in $\Delta$ are connected implicitly by ⅋.

By adapting Smullyan’s uniform notation to $\mathcal{MELL}$ we obtain a compact
representation of sequent calculi, which simplifies proofs about their properties.
A signed formula $\varphi = \langle F, k \rangle$ denotes an occurrence of $F$ in $\Delta$ or $\Gamma$. Depending
on the label $F$ and its polarity $k \in \{+, -\}$, a signed formula will receive a type
$\alpha$, $\beta$, $\nu$, $\pi$, $o$, $\tau$, $\omega$, or $lit$ according to the tables below. The functions $succ_1$ and
$succ_2$ return the major signed subformulas of a signed formula. Note that during
the decomposition of a formula the polarity switches only for ${}^\perp$ and $\multimap$. We
use type symbols as meta-variables for signed formulas of the respective type,
e.g. $\alpha$ stands for a signed formula of type $\alpha$.






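$\alpha$              $\langle F_1 \otimes F_2, -\rangle$    $\langle F_1$ ⅋ $F_2, +\rangle$    $\langle F_1 \multimap F_2, +\rangle$
$succ_1(\alpha)$      $\langle F_1, -\rangle$                $\langle F_1, +\rangle$            $\langle F_1, -\rangle$
$succ_2(\alpha)$      $\langle F_2, -\rangle$                $\langle F_2, +\rangle$            $\langle F_2, +\rangle$

$\beta$               $\langle F_1 \otimes F_2, +\rangle$    $\langle F_1$ ⅋ $F_2, -\rangle$    $\langle F_1 \multimap F_2, -\rangle$
$succ_1(\beta)$       $\langle F_1, +\rangle$                $\langle F_1, -\rangle$            $\langle F_1, +\rangle$
$succ_2(\beta)$       $\langle F_2, +\rangle$                $\langle F_2, -\rangle$            $\langle F_2, -\rangle$

$o$                   $\langle F^\perp, -\rangle$            $\langle F^\perp, +\rangle$
$succ_1(o)$           $\langle F, +\rangle$                  $\langle F, -\rangle$

$\nu$                 $\langle !F, -\rangle$                 $\langle ?F, +\rangle$
$succ_1(\nu)$         $\langle F, -\rangle$                  $\langle F, +\rangle$

$\pi$                 $\langle ?F, -\rangle$                 $\langle !F, +\rangle$
$succ_1(\pi)$         $\langle F, -\rangle$                  $\langle F, +\rangle$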



A sequent calculus $\Sigma'_1$ based on this uniform notation is depicted in Table 1.
In a rule the sequents above the line are the premises and the one below the
conclusion. A principal formula is a formula that occurs in the conclusion but
not in any premise. Formulas that occur in a premise but not in the conclusion
are called active. All other formulas compose the context. $\Sigma'_1$ is shown correct
and complete wrt. Girard’s original sequent calculus [12] by a straightforward
induction over the structure of proofs.
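
The uniform-notation tables for the connectives can be read as a function from signed formulas to a type and a list of signed successors. The sketch below encodes exactly that reading and nothing more. It is an illustration under assumptions: formulas are represented as nested tuples with hypothetical tags (`"atom"`, `"neg"`, `"tensor"`, `"par"`, `"lolli"`, `"bang"`, `"quest"`), and the constants 1 and bottom (types τ and ω) are deliberately left out.

```python
def flip(k):
    """Switch a polarity."""
    return "-" if k == "+" else "+"


def classify(formula, k):
    """Return (type, successors) for the signed formula <formula, k>.

    Successors are listed in the order succ_1, succ_2.
    """
    op = formula[0]
    if op == "atom":
        return "lit", []
    if op == "neg":                      # linear negation switches the polarity
        return "o", [(formula[1], flip(k))]
    if op in ("bang", "quest"):
        # <!F,-> and <?F,+> have type nu; <!F,+> and <?F,-> have type pi
        generic = (op == "bang") == (k == "-")
        return ("nu" if generic else "pi"), [(formula[1], k)]
    f1, f2 = formula[1], formula[2]
    if op == "tensor":
        return ("alpha" if k == "-" else "beta"), [(f1, k), (f2, k)]
    if op == "par":
        return ("alpha" if k == "+" else "beta"), [(f1, k), (f2, k)]
    if op == "lolli":                    # implication switches the polarity of F1 only
        return ("alpha" if k == "+" else "beta"), [(f1, flip(k)), (f2, k)]
    raise ValueError(f"unsupported connective: {op}")


# Hypothetical encoding of <(A tensor !A) par ?(A^perp), +>.
A = ("atom", "A")
PHI = ("par", ("tensor", A, ("bang", A)), ("quest", ("neg", A)))
```

Calling `classify(PHI, "+")` yields type alpha with the two positive successors, the tensor subformula (type beta) and the ?-subformula (type nu), so a proof search on this formula has to deal with both a context split and a generic formula.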

In analytic proof search one starts with the sequent to be proven and reduces
it by application of rules until the axiom-rule or the $\tau$-rule can be applied. There
are several choice points within this process. First, a principal formula must be
chosen. Unless the principal formula has type $\nu$, this choice determines which
rule must be applied. If a $\beta$-rule is applied, the context of the sequent must
be partitioned onto the premises (context splitting). Several solutions have been
proposed in order to optimize these choices [1,10,23,6,13]. Additional difficulties
arise from the rules axiom, $\tau$, and $\pi$. The rules axiom and $\tau$ require an empty
context, which expresses that all formulas must be used up in a proof. The $\pi$ rule
requires that all formulas in the context are of type $\nu$. Though the connectives
of linear logic make proof search more difficult, they also give rise to new
possibilities. Some applications for linear logic programming are illustrated in [19].

Fig. 1. Example $\Sigma'_1$-proof.



Example 1. Figure 1 presents a Σ'_1-proof of φ = ⟨(A ⊗ !A) ⅋ ?(A^⊥), +⟩. We
abbreviate subformulas φ' of φ by position markers as shown in the table on the
right. The proof requires that the contraction rule c is applied before the β-rule.



3 N-adic Sequent Calculi for MELL

In this section we define two intermediate sequent calculi which are closely
related to Andreoli's dyadic calculus Σ_2 and triadic calculus Σ_3 [1] but differ
in the way structural rules are handled. While Andreoli uses a lazy strategy for
both contraction and weakening, our calculi Σ'_2 and Σ'_3, which are not intended
for proof search, are based on an eager strategy for contraction. Eager
contraction corresponds to the concept of multiplicities in matrix
characterizations [24].



3.1 Dyadic Calculus 



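 ------------------- axiom          --------- τ
  Θ : ⟨A,+⟩, ⟨A,−⟩                   Θ : τ

  Θ : Υ, succ1(α), succ2(α)          Θ, succ1(ν)^μ(ν) : Υ
 --------------------------- α      ---------------------- ν
  Θ : Υ, α                           Θ : Υ, ν

  Θ : Υ, succ1(o)                    Θ : Υ
 ----------------- o                ----------- ω
  Θ : Υ, o                           Θ : Υ, ω

  Θ1 : Υ1, succ1(β)    Θ2 : Υ2, succ2(β)
 ---------------------------------------- β
  Θ1, Θ2 : Υ1, Υ2, β

  Θ : succ1(π)                       Θ : Υ, φ
 -------------- π                   ----------- focus
  Θ : π                              Θ, φ : Υ

Table 2. Dyadic sequent calculus Σ'_2 for MELL in uniform notation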



In sequent proofs there are two possible notions of occurrence of a formula φ: an
occurrence of φ as subformula in some formula tree or its occurrences within a
derivation. The difference between these two becomes apparent only when
contraction is applied. In MELL only formulas of type ν are generic, i.e. may be
contracted. Since we are aiming at a matrix characterization, we apply
contraction in an eager way. For this purpose we introduce a function μ which
determines the multiplicity of an occurrence of a formula, i.e. the number of
copies of that occurrence in a proof.* Let Θ and Υ be multi-sets of signed
formulas. A dyadic sequent S = Θ : Υ has two zones which are separated by a
colon. Θ is called the unbounded zone and Υ the bounded zone of S.

The sequent calculus Σ'_2 for dyadic sequents depicted in table 2 employs
eager contraction. Derivations of a dyadic sequent S are defined with respect to
a fixed multiplicity function μ. This function is important whenever the ν-rule
is applied. Additionally, Σ'_2 uses lazy weakening. There is no explicit
weakening rule but weakening is done implicitly in the rules axiom and τ. The π
rule requires that all context formulas are in the unbounded zone instead of
requiring them to have a specific type as in Σ'_1. A special rule focus moves a
formula from the unbounded into the bounded zone. A signed formula φ is derivable
in Σ'_2 if • : φ is derivable for some multiplicity, where • denotes the empty
multi-set.

* Wallen's multiplicities for modal and intuitionistic logics are based on
occurrences within a formula tree [24]. Our notion respects the resource
sensitivity of linear logic.

Theorem 2 (Completeness). Let P_1 be a Σ'_1-proof for S_1 = Υ, ν_c where ν_c
consists of all signed formulas of type ν in S_1 to which the contraction rule is
not applied in P_1. Then there is a multiplicity μ_2 such that the dyadic sequent
S_2 = succ1(ν_c) : Υ can be derived in Σ'_2.

Theorem 3 (Correctness). Let P_2 be a Σ'_2-proof for S_2 = Θ⁺, Θ⁻ : Υ with
multiplicity μ_2 where Θ⁺ and Θ⁻ contain only positive and negative signed
formulas, respectively. Then there exists a Σ'_1-proof P_1 for the unary sequent
S_1 = ?Θ⁺, !Θ⁻, Υ.



3.2 Triadic Calculus 



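 ----------------------- axiom           ------------- τ
  Θ : ⟨A,+⟩, ⟨A,−⟩ ⇑ •                    Θ : τ ⇑ •

  Θ : Υ ⇓ succ1(o)                        Θ : Υ ⇑ Ξ, succ1(o)
 ------------------ o⇓                   --------------------- o⇑
  Θ : Υ ⇓ o                               Θ : Υ ⇑ Ξ, o

  Θ : Υ ⇑ Ξ, succ1(α), succ2(α)           Θ : Υ ⇑ Ξ
 ------------------------------- α       ---------------- ω
  Θ : Υ ⇑ Ξ, α                            Θ : Υ ⇑ Ξ, ω

  Θ, succ1(ν)^μ(ν) : Υ ⇑ Ξ
 -------------------------- ν
  Θ : Υ ⇑ Ξ, ν

  Θ1 : Υ1 ⇓ succ1(β)    Θ2 : Υ2 ⇓ succ2(β)
 ------------------------------------------ β
  Θ1, Θ2 : Υ1, Υ2 ⇓ β

  Θ : • ⇓ succ1(π)
 ------------------ π
  Θ : • ⇓ π

  Θ : Υ ⇓ φ                               Θ : Υ ⇓ φ
 ---------------- focus1                 ---------------- focus2
  Θ : Υ, φ ⇑ •                            Θ, φ : Υ ⇑ •

  Θ : Υ, φ ⇑ Ξ                            Θ : Υ ⇑ φ
 ---------------- defocus                ------------- switch
  Θ : Υ ⇑ Ξ, φ                            Θ : Υ ⇓ φ

In focus2 φ must not be of type lit or τ. In defocus φ must be of type
lit, τ, β, or π. In switch φ must be of type lit, τ, ω, α, or ν.

Table 3. Triadic sequent calculus Σ'_3 for MELL in uniform notation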



During proof search in sequent calculi the order of some rule applications may
be permuted. For linear logic, the permutabilities and non-permutabilities of
sequent rules have been investigated in [1,10]. Andreoli's focusing principle [1]
makes it possible to fix the order of permutable rules without losing
completeness. A distinctive feature of this principle is that the reduction
ordering is determined for layers of formulas rather than for individual
formulas. Let φ be a signed formula, Ξ be a sequence and Θ and Υ be multi-sets of
signed formulas. A triadic sequent S = Θ : Υ ⇓ φ or S = Θ : Υ ⇑ Ξ has three
zones. Θ is called the unbounded zone, Υ the bounded zone, and φ or Ξ the focused
zone. The sequent is either in synchronous (⇓) or in asynchronous mode (⇑).

† For convenience we extend functions and connectives to multi-sets of signed
formulas. succ1(ν_c) abbreviates {succ1(ν) | ν ∈ ν_c}, !Θ denotes
{⟨!F,k⟩ | ⟨F,k⟩ ∈ Θ}, etc.
The sequent calculus Σ'_3 for triadic sequents depicted in table 3 employs the
focusing principle. Derivations are defined with respect to a fixed multiplicity
function μ. The multiplicity is important whenever the rule ν is applied. A signed
formula φ is derivable in Σ'_3 if the sequent • : φ ⇑ • is derivable for some μ.

In Σ'_3 there are two focusing rules, focus1 and focus2, which move a signed
formula into the focus. Both rules switch the sequent into synchronous mode.
Depending on the structure of the formula this enforces a sequence of rules to be
applied next. Since selection of these rules is deterministic, the permutabilities
of rule applications are removed from the search space. The matrix
characterization developed in this article exploits this focusing principle.
However, it yields a representation with even fewer redundancies than a calculus
like Σ'_3 can.

Theorem 4 (Completeness). Let P_2 be a Σ'_2-proof for S_2 = Θ : Υ with
multiplicity μ_2 and let Υ' be a linearization of Υ. Then there is a multiplicity
μ_3 such that the triadic sequent S_3 = Θ : • ⇑ Υ' is derivable in Σ'_3.

Theorem 5 (Correctness). Let P_3 be a Σ'_3-proof for S_3 = Θ : Υ ⇑ Ξ (or
Θ : Υ ⇓ φ) with multiplicity μ_3. Then there is a Σ'_2-proof P_2 for the dyadic
sequent S_2 = Θ : Υ, Ξ' (or Θ : Υ, φ) for some multiplicity μ_2, where Ξ' is the
multi-set that contains the same signed formulas as the sequence Ξ.



4 A Position Calculus for MELL

Fig. 2. Rules for inserting special positions into basic position trees



In the previous section we have used established techniques for removing
redundancies in the search space of a specific sequent proof. In order to reason
about logical validity as such, the non-classical aspects of sequent proofs must
be expressed in a more compact way. Wallen [24] uses prefixes of special
positions for modal logics and intuitionistic logic. We adapt this approach to
MELL and introduce position calculi as an intermediate step in the development of
matrix characterizations. We distinguish an occurrence of a formula as a
subformula from its occurrences within a proof by using basic positions for the
former and positions for the latter.
Basic Position Trees. Let V_b be an arbitrary set of basic positions. The basic
position tree for a signed formula φ is a directed tree T_b = (V_b, E_b) which
originates from a formula tree by inserting additional nodes.‡ These nodes are
inserted by applying rewrite rules from figure 2 until none of them is applicable.
The functions succ1 and succ2 are redefined for basic position trees such that
succ1(bp) is the left and succ2(bp) the right successor of the node bp. For a
basic position bp ∈ V_b, the corresponding formula, main connective, signed
formula, polarity, and type can be retrieved by the functions lab, con, sform,
pol, and Ptype, respectively. For the inserted nodes we introduce new special
types φ^M, ψ^M, φ^E, and ψ^E, which are assigned according to the rewrite rules.
For an inserted basic position of type φ^M, ψ^M, or φ^E the value of the
functions lab, con, and pol equals the value of the successor node and for type
ψ^E it is the value of the predecessor node. We use type symbols as
meta-variables for basic positions of the respective type and a as a
meta-variable for basic positions of type lit.

A rewrite rule R can be applied to a tree T if its left hand side matches a 
subtree T' of T. In this case T' is rewritten according to the pattern on the right 
hand side of R. The dotted lines in the patterns match arbitrary subtrees that 
contain only nodes of type o. A special case are the rules R^j and R.^.. They can 
only be applied if there are just positions of type o between the root and the leaf. 
The other rewrite rules separate layers of subformulas within a formula tree: R*^ 
inserts special positions wherever a subformula of type t\ has a subformula of 
type t2. The rewrite system is confluent and noetherian. 



Example 6. We illustrate the application of the rule R^. Below, we have depicted 
the formula tree T_1 for ⟨(A ⊗ !A) ⅋ ?(A^⊥), +⟩. β_00 is a successor of α_0 with
no nodes in between. The subtree consisting of α_0, β_00, and the edge between them
matches the left hand side of R^. The tree is rewritten to T2 and can further be 
rewritten by applying Rf , R^, and RJ^. The resulting basic position tree is T3. 



[Figure: formula tree T_1, intermediate tree T_2, and basic position tree T_3,
together with a table of the main connectives con(p) of the positions p.]

Special positions represent the possible behavior of a layer within a
derivation. Positions of type ψ^M and ψ^E are called constants while those of
type φ^M and φ^E are variables. Inserted variables express that the corresponding
formula may be part of the bounded (φ^M) or unbounded zone (φ^E). During a
sequence of proof rule applications that ends with a constant position, a part of
the context may have to be of a specific type. The requirements of the π-rule,
for instance, are expressed by a ψ^E position.

‡ We denote basic positions by strings over {0, 1}. 0 is the root position of a
tree. Extending a position by 0 or 1 yields the basic position for the left or
right successor node.



Position Trees, Position Forests, and Position Sequents. A position tree
for a signed formula wrt. a multiplicity μ is constructed from a basic position
tree by successively replacing subtrees of positions ν with μ(ν) = n with n
copies of that tree, starting from the root. New positions are assigned to the
copies.§
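
The construction of a position tree from a basic position tree can be sketched
as a recursive copy. This is only an illustration under assumptions: trees are
encoded as (type, successors) pairs, the multiplicity is a dictionary keyed by
the (already expanded) position of each ν node with a default of 1, and copies
are named with a bracketed index, which is a hypothetical stand-in for the
exponent notation.

```python
def expand(basic, mu, pos="0"):
    """Return the position tree (pos, type, successors) for a basic tree.

    basic: (ptype, [successor trees]); mu: dict from nu-positions to
    multiplicities. The subtree below a nu node is copied mu[pos] times.
    """
    ptype, succs = basic
    if ptype == "nu":
        (body,) = succs                      # nu has exactly one successor
        n = mu.get(pos, 1)                   # assumed default multiplicity
        kids = [expand(body, mu, f"{pos}0[{k}]") for k in range(1, n + 1)]
    else:
        kids = [expand(s, mu, pos + str(i)) for i, s in enumerate(succs)]
    return (pos, ptype, kids)

# ?(A^perp) as a basic tree: nu above o above a literal
basic = ("nu", [("o", [("lit", [])])])
tree = expand(basic, {"0": 2})
```

Because the multiplicity is looked up by expanded position, nested ν nodes in
different copies may receive different multiplicities, and a multiplicity of 0
leaves the ν node without successors.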




Fig. 3. Rules for inserting special positions into basic position forests 



As formulas can be represented as formula trees, sequents can be represented
as sequent forests, i.e. collections of formula trees that are divided into
different zones. A basic position forest for a triadic sequent is a collection of
basic position trees which consists of three zones (unbounded, bounded, and
focused) and has a mode (⇓ or ⇑). The trees are modified by the rewrite rules in
figure 3, which modify trees only at their roots. The exponent of a rewrite rule
defines the zone in which it can be applied. R^Θ, R^Υ, R^⇑, and R^⇓ can be
applied to trees in the unbounded zone, bounded zone, focused zone (mode ⇑) and
focused zone (mode ⇓), respectively. A position forest is a collection of
position trees which is divided into three zones, has a mode, and which is
constructed from a basic position forest together with a multiplicity. We use
position sequent as well as position matrix as a synonym for position forest. We
denote trees by their roots if the rest of the tree is obvious from the context.
Note that by the definition of the rewrite rules a root in the unbounded zone is
always of type φ^E. A root in the bounded zone is of type φ^M. A root in the
focused zone in mode ⇑ has a type in {o, ω, α, ν, φ^M}, and a root in mode ⇓ has
a type from {o, β, π, ψ^M, ψ^E}.



Example 7. Position sequents can be represented graphically. Let φ^M_0 be the
root of a position tree which corresponds to ⟨(A ⊗ !A) ⅋ ?(A^⊥), +⟩ with
multiplicity μ(ν_0001) = 2. The position sequent for • : φ^M_0 ⇑ • is depicted
in figure 4.

^ We denote positions by strings over {0, 1, Positions in different copies of 
a subtree are distinguished by their exponents. 

Fig. 4. Position sequent for the sequent • : φ^M_0 ⇑ •, annotated with the
prefixes of leaves (see section 5)



Position Calculus. A sequent calculus Σ'_pos for position sequents is depicted in
table 4. Derivations of a position sequent S are defined with respect to a fixed
multiplicity function μ for S. The position calculus makes apparent that, as
pointed out earlier, inserted positions express how the context is affected in
certain rules. The rewrite rules in figure 3 guarantee that the reduction of a
position sequent by a rule results again in a position sequent. In a proof, the
mode of a position sequent can be switched either by ψ^M or ψ^E. There is no
defocus rule as in Σ'_3 nor does the π rule cause a switch. Defocusing is done by
the rules φ^M and ν. For all other rules there is a corresponding rule in Σ'_3.



Table 4. Position calculus Σ'_pos for MELL in uniform notation



Theorem 8 (Completeness). Let P_3 be a Σ'_3-proof for S_3 = Θ : Υ ⇑ Ξ (or
S_3 = Θ : Υ ⇓ φ) with multiplicity μ_3. Then there is a multiplicity such that
the corresponding position sequent is derivable in Σ'_pos.



Theorem 9 (Correctness). Let P be a Σ'_pos-proof for S = Φ^E : Φ^M ⇑ Ξ (or
S = Φ^E : Φ^M ⇓ φ) with multiplicity μ. Then there is a multiplicity μ' such that
the corresponding triadic sequent S' = sform(Φ^E) : sform(Φ^M) ⇑ sform(Ξ) (or
S' = sform(Φ^E) : sform(Φ^M) ⇓ sform(φ)) is derivable in Σ'_3.

5 The Matrix Characterization for MELL

On the basis of the position calculus we can now develop a compact
characterization of logical validity. For this purpose we extend the concept of
position forests by important technical notions, state the requirements for
validity in MELL, and prove them to be sufficient and complete. We then summarize
all the insights gained in this paper in a single characterization theorem.

Fundamental Concepts. A matrix characterization of logical validity of a
formula φ is expressed in terms of properties of certain sets of subformulas of φ.
We use positions to achieve a compact representation of φ and its subformulas.

A matrix M is a position forest, constructed from a basic position forest and
a multiplicity μ. The set of positions in a matrix M is denoted by Pos(M). The
set of axiom positions AxPos(M) contains all positions with principal type τ
or lit. The set of weakening positions WeakPos(M) contains all positions with
principal type ω and all positions of type ν with μ(ν) = 0. The set of leaf
positions is defined as LeafPos(M) = AxPos(M) ∪ WeakPos(M). β(M) is the
set of all positions in Pos(M) of type β. The sets Φ^M(M), Φ^E(M), Ψ^M(M),
and Ψ^E(M) are defined accordingly. The set of special positions is defined as

SpecPos(M) = Φ^M(M) ∪ Φ^E(M) ∪ Ψ^M(M) ∪ Ψ^E(M).

A weakening map for M is a subset of Φ^E(M) ∪ WeakPos(M). This novel
concept is required because of the restricted application of weakening in MELL.

A path is a set of positions. The set Paths(T) of paths through a position tree
T is defined recursively by

- P = {0}, the set containing the root of T, is a path through T.

- If P ∪ {p} is a path through T then the following are paths:

    P ∪ {succ1(p), succ2(p)}              if Ptype(p) = α
    P ∪ {succ1(p)} and P ∪ {succ2(p)}     if Ptype(p) = β
    P ∪ {succ1(p)}                        if Ptype(p) ∈ {o, π, φ^M, φ^E, ψ^M, ψ^E}
    P ∪ ⋃_{i ≤ μ(p)} {succ1^i(p)}         if Ptype(p) = ν and μ(p) > 0

The set of paths through a set of position trees F is defined recursively by

    Paths(∅) = ∅,
    Paths({T}) = Paths(T), and
    Paths({T} ∪ F) = {P_1 ∪ P_2 | P_1 ∈ Paths(T), P_2 ∈ Paths(F)}.

The set of paths through a matrix is defined by

    Paths(Φ^E : Φ^M ⇓ φ) = Paths(Φ^E ∪ Φ^M ∪ {φ}) and
    Paths(Φ^E : Φ^M ⇑ Ξ) = Paths(Φ^E ∪ Φ^M ∪ Ξ').

A path of leaves through M is a path through M that is a subset of LeafPos(M).
LPaths(M) denotes the set of all paths of leaves through M. Since leaf positions
are not decomposed in the definition of paths, a path of leaves contains only
irreducible positions.
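
The paths of leaves can be computed directly by recursion over a position tree.
The sketch below is an illustration under assumptions: the Pos record and the
function leaf_paths are hypothetical names, a ν node lists its copies as
successors, and the special positions are omitted from the example tree because
they have a single successor and merely pass paths through.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pos:
    name: str
    ptype: str
    succ: tuple = ()

def leaf_paths(p):
    """Return all paths of leaves (as frozensets of names) below position p."""
    if not p.succ:
        return [frozenset([p.name])]          # irreducible leaf position
    if p.ptype == 'beta':
        # a beta position contributes the paths of either successor
        return [path for s in p.succ for path in leaf_paths(s)]
    # alpha, nu (all copies) and unary types: combine one path per successor
    result = [frozenset()]
    for s in p.succ:
        result = [a | b for a in result for b in leaf_paths(s)]
    return result

# ((A tensor !A) par ?(A^perp), +) with two copies below the nu position
tree = Pos('root', 'alpha', (
    Pos('b', 'beta', (Pos('A+', 'lit'), Pos('p', 'pi', (Pos("A+'", 'lit'),)))),
    Pos('n', 'nu', (Pos('o1', 'o', (Pos('A-1', 'lit'),)),
                    Pos('o2', 'o', (Pos('A-2', 'lit'),)))),
))
paths = leaf_paths(tree)
```

For this tree the β position splits the matrix into two paths of leaves, each
of which contains both copies of the negative literal.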
A connection in a matrix M is a subset of AxPos(M). It is either a two-element
set {p_1, p_2} where p_1 and p_2 are positions with different polarity,
lab(p_1) = lab(p_2), and Ptype(p_1) = lit = Ptype(p_2), or a one-element set {p_1}
with Ptype(p_1) = τ. A connection C is on a path P if C ⊆ P.

The prefix pre_M(p) of a position p ∈ Pos(M) is defined by

    pre_M(0)   = r_0               if r = Ptype(0) ∈ {φ^M, ψ^M}
               = ε                 otherwise

    pre_M(p'i) = pre_M(p') r_p'i   if r = Ptype(p'i) ∈ {φ^M, φ^E, ψ^M, ψ^E}
               = pre_M(p')         otherwise

If p_1 ≪ p_2, i.e. p_1 is a predecessor of p_2 in the position tree, then
pre_M(p_1) is an initial substring of pre_M(p_2). We denote this by
pre_M(p_1) ⊑ pre_M(p_2). The prefixes of leaves of the example position sequent
are displayed in figure 4.
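
Read operationally, the prefix of a position collects the special positions on
the branch from the root down to that position. A minimal sketch follows; the
dictionaries parent and ptype, the type names, and the assumption that all four
special types contribute to the prefix are illustrative choices, not taken from
the definition verbatim.

```python
SPECIAL = {'phiM', 'phiE', 'psiM', 'psiE'}

def prefix(pos, parent, ptype):
    """Return the prefix of pos as a tuple of special positions, root first."""
    chain = []
    while pos is not None:
        if ptype[pos] in SPECIAL:
            chain.append(pos)
        pos = parent.get(pos)                 # None once the root is passed
    return tuple(reversed(chain))

# A small hypothetical branch: 0 (phiM) - 00 (alpha) - 000 (psiM) - 0000 (lit)
parent = {'00': '0', '000': '00', '0000': '000'}
ptype = {'0': 'phiM', '00': 'alpha', '000': 'psiM', '0000': 'lit'}
leaf_prefix = prefix('0000', parent, ptype)
inner_prefix = prefix('00', parent, ptype)
```

In this branch the prefix of the inner position 00 is an initial segment of the
prefix of the leaf 0000, as stated for predecessors.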

A multiplicative prefix substitution is a mapping σ_M : Φ^M → (Φ^M ∪ Ψ^M)*. An
exponential prefix substitution is a mapping
σ_E : Φ^E → (Φ^M ∪ Φ^E ∪ Ψ^M ∪ Ψ^E)*. A multiplicative exponential prefix
substitution is a mapping σ : (Φ^M ∪ Φ^E) → (Φ^M ∪ Φ^E ∪ Ψ^M ∪ Ψ^E)* which maps
elements from Φ^M to strings from (Φ^M ∪ Ψ^M)* only. Substitutions are extended
homomorphically to strings and are assumed to be computed by unification.
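
The homomorphic extension of a substitution to prefixes, and the check that a
given substitution identifies the prefixes of connected positions, can be
sketched as follows. This only verifies a candidate substitution and does not
compute one by unification; the names apply and unifies and the dictionary
encoding are assumptions.

```python
def apply(sigma, prefix):
    """Replace each variable in the prefix by its value; keep other positions."""
    out = []
    for pos in prefix:
        out.extend(sigma.get(pos, (pos,)))
    return tuple(out)

def unifies(sigma, connections, prefixes):
    """Check that all positions of each connection get the same prefix."""
    return all(len({apply(sigma, prefixes[p]) for p in c}) == 1
               for c in connections)

# Hypothetical prefixes: x and y are variables, a and c are constants
prefixes = {'p': ('x', 'c'), 'q': ('a', 'y')}
sigma = {'x': ('a',), 'y': ('c',)}
ok = unifies(sigma, [{'p', 'q'}], prefixes)
```

With the empty substitution the two prefixes stay different, so the same
connection would not pass the check.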



Complementarity. Matrix characterizations for classical [5] and non-classical
[24] logics are based on a notion of complementarity. Essentially this means
that every path through a matrix must contain a unifiable connection. These
requirements also hold for linear logic but have to be extended by a few additional
properties. We shall specify all these requirements now.

In the following we always assume M to be a matrix, 𝒞 and W to be a set
of connections and a weakening map for M, and σ to be a prefix substitution.

- The spanning property is the most fundamental requirement. Each path of
leaves must contain a connection. A set of connections 𝒞 spans a matrix M
iff for every path P ∈ LPaths(M) there is a connection C ∈ 𝒞 with C ⊆ P.

- The unifiability property states that connected leaves must be made identical
wrt. their prefixes. Furthermore, because of the restricted application of
weakening in MELL, each position in a weakening map must be related to a
connection. (𝒞, W) is unifiable if there exists a prefix substitution σ such
that (1) σ(pre_M(p_1)) = σ(pre_M(p_2)) for all C ∈ 𝒞 and p_1, p_2 ∈ C and (2) for
all wp ∈ W there is some C = {p, ...} ∈ 𝒞 with σ(pre_M(wp)) ⊑ σ(pre_M(p)).
In this case σ is called a unifier for (𝒞, W).

A unifier σ of (𝒞, W) can always be modified to a unifier σ' such that
σ'(pre_M(φ^E)) = σ'(pre_M(p)) holds for all φ^E ∈ W and some connection
C = {p, ...} ∈ 𝒞. A grounded substitution σ' is constructed from a substitution
σ by removing all variable positions from values. If σ is a unifier for
(𝒞, W) then the corresponding grounded substitution σ' is a unifier as well.

- The linearity property expresses that no resource is to be used twice, i.e.
contraction is restricted. A resource cannot be connected more than once and
cannot be connected at all if that part of the formula is weakened. (𝒞, W) is
linear for M iff (1) any two connections C_1 ≠ C_2 ∈ 𝒞 are disjoint and (2) no
predecessor φ^E of a position p in a connection C ∈ 𝒞 belongs to the set W.

- The relevance property requires that each resource must be used at least once.
A resource is used if it is connected or weakened. (𝒞, W) has the relevance
property for M if for every p ∈ LeafPos(M) either (1) p ∈ C for some C ∈ 𝒞, or
(2) p ∈ W, or (3) some predecessor of p belongs to W.

- The cardinality property expresses that the number of branches in a sequent
proof is adequate. It replaces the minimality property in [8], which would
require a complicated test for MELL. A pair (𝒞, W) has the cardinality
property for M if \C\ + + 1- 

Definition 10. A matrix M is complementary iff there are a set of connections
𝒞, a weakening map W and a prefix substitution σ such that (1) 𝒞 spans M, (2)
σ is a unifier for (𝒞, W), and (3) (𝒞, W) is linear for M and has the relevance
and cardinality properties. We also say that M is complementary for 𝒞, W, and σ.
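
Three of the requirements (spanning, linearity, and relevance) are simple set
conditions and can be checked mechanically once the paths of leaves and the
predecessor relation are known. The sketch below covers only these three;
unifiability and cardinality are not checked. The function names and the
ancestors dictionary (position to the set of its predecessors) are assumptions.

```python
def spans(connections, leaf_paths):
    """Every path of leaves contains some connection."""
    return all(any(c <= path for c in connections) for path in leaf_paths)

def is_linear(connections, weakening, ancestors):
    """Connections are pairwise disjoint and no connected position lies
    below a weakened position (weakening positions that are leaves can
    never be predecessors, so testing all ancestors is sufficient)."""
    conns = list(connections)
    disjoint = all(conns[i].isdisjoint(conns[j])
                   for i in range(len(conns))
                   for j in range(i + 1, len(conns)))
    unweakened = all(ancestors[p].isdisjoint(weakening)
                     for c in conns for p in c)
    return disjoint and unweakened

def is_relevant(connections, weakening, ancestors, leaves):
    """Every leaf is connected, weakened, or below a weakened position."""
    connected = set().union(*connections)
    return all(p in connected or p in weakening
               or not ancestors[p].isdisjoint(weakening)
               for p in leaves)

# Leaves of ((A tensor !A) par ?(A^perp), +) with two copies of the literal
leaves = {'A+', "A+'", 'A-1', 'A-2'}
paths = [frozenset({'A+', 'A-1', 'A-2'}), frozenset({"A+'", 'A-1', 'A-2'})]
conns = [frozenset({'A+', 'A-1'}), frozenset({"A+'", 'A-2'})]
ancestors = {p: set() for p in leaves}        # predecessors omitted here
```

With the two connections above and an empty weakening map, all three checks
succeed for the example matrix.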

The complementarity of a matrix ensures the existence of a corresponding
Σ'_pos-proof. Each requirement captures an essential aspect of such a proof.
Thus, there are relations between basic concepts in matrix proofs and
Σ'_pos-proofs. Paths are related to sequents. A connection on a path expresses
the potential to close a Σ'_pos-branch by an application of axiom or τ which
involves the connected positions. A weakening map W contains all positions which
are explicitly weakened by the rules ω and ν (for μ(ν) = 0) or implicitly
weakened in axiom and τ. The unifiability of prefixes guarantees that connected
positions can move into the same Σ'_pos-branch and that positions in W can be
weakened in some branch. Linearity and relevance reflect the lack of contraction
and weakening for arbitrary formulas, while cardinality expresses the absence of
the rule of mingle, i.e. a proof can only branch at the reduction of β-type
positions.

Since complementarity captures the essential aspects of Σ'_pos-proofs but no
unimportant details, the search space is once more compactified. Problems such
as context splitting at the reduction of β-type positions simply do not occur.

Soundness. The unifiability property requires that each position in a weakening
map W is related to some connection C ∈ 𝒞. Let AssSet(C) be the union of C
and the set of all positions in W which are related to C.

A matrix M can be seen as a collection of trees, i.e. a forest. Let T_1 and
T_2 be position trees in M. We add for each connection C edges which link all
positions in AssSet(C). In order to identify the connected components of the
resulting graph we define a relation AssRel by

AssRel(T_1, T_2) iff ∃C ∈ 𝒞 ∃p_1 ∈ Pos(T_1), p_2 ∈ Pos(T_2). p_1, p_2 ∈ AssSet(C)

Let ~ be the reflexive transitive closure of AssRel. A connected component in
M is a set of position trees which is an equivalence class of ~.
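
The connected components induced by AssRel can be computed with a standard
union-find structure over the position trees. The sketch below swaps in this
well-known technique for illustration; the function name and the inputs
tree_of (position to tree) and ass_sets (one AssSet per connection) are
assumptions.

```python
def connected_components(trees, tree_of, ass_sets):
    """Group trees that are linked through the positions of some AssSet."""
    parent = {t: t for t in trees}

    def find(t):
        # follow parent links to the representative, halving the path
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for ass in ass_sets:
        linked = [tree_of[p] for p in ass]
        for t in linked[1:]:
            parent[find(t)] = find(linked[0])

    groups = {}
    for t in trees:
        groups.setdefault(find(t), set()).add(t)
    return {frozenset(g) for g in groups.values()}

components = connected_components(
    ['T1', 'T2', 'T3'],
    {'p': 'T1', 'q': 'T2', 'r': 'T3'},
    [{'p', 'q'}],
)
```

Here the connection between p and q merges T1 and T2 into one component, while
T3 remains a component of its own.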

We define a function FCons which reduces a set of connections 𝒞 to those
connections C whose elements are contained within a certain position forest F.
We define FCons(F, 𝒞) = {C ∈ 𝒞 | ∀p ∈ C. p ∈ Pos(F)}.

Theorem 11. Let (𝒞, W) be linear and relevant for the matrix M and σ be a
unifier for (C, W). Then \fi{!F)\ < \FCons{T ,C) \ + holds for 

any connected component T in M. 

Theorem 11 is proven by induction over the size of β(F). In the induction step
one first shows that there is always a node of type β, the separator of a β-chain,
which can be removed such that the preconditions of the theorem hold. The
induction hypothesis can be applied after removing that node.

Corollary 12. Let (𝒞, W) be linear for the matrix M and have the relevance
and cardinality properties for M. Let σ be a unifier for (𝒞, W). Then there is
exactly one connected component in M.



Lemma 13. Let (𝒞, W) be linear for the matrix M and have the relevance and
cardinality properties for M = Φ^E : Φ^M ⇓ ψ^E. Let σ be a grounded prefix
substitution which unifies (𝒞, W). Then for any p ∈ LeafPos(M) there is a
string s such that σ(pre_M(p)) = ψ^E s holds.

Corollary 12 and lemma 13 show how the prefix substitution guarantees a
proper context management. Due to the definition of prefix substitutions the
multi-set Φ^M in lemma 13 must be empty. This ensures that the rule ψ^E is
applicable in the constructed position calculus proof.

Theorem 14 (Correctness). If a matrix M is complementary for a set 𝒞 of
connections, a weakening map W, and a substitution σ then there is a position
calculus proof P for M.

Proof. Define the weight of a matrix M as the number of positions in M
and use well-founded induction wrt. the weight of matrices. In the induction step we
perform a complete case analysis of the structure of M. The difficult cases where β
or ψ^E must be applied are shown with the help of corollary 12 and lemma 13. The
remaining proof is tedious but straightforward. Details can be found in [17].

Completeness. For a Σ'_pos-proof P of a matrix M we construct a set ConSet(P)
of connections, a weakening map WeakMap(P), and a relation ⊏_P ⊆ SpecPos(M)².
The connections in ConSet(P) are constructed from applications of axiom and
τ in P. If axiom is applied on Φ^E : φ^M_1, φ^M_2 ⇑ • then
{succ1(φ^M_1), succ1(φ^M_2)} is in ConSet(P). If τ is applied on a sequent
Φ^E : φ^M ⇑ • then {succ1(φ^M)} is in ConSet(P). WeakMap(P) contains those
elements of WeakPos(M) which are explicitly weakened by ω or ν and those elements
from Φ^E(M) which are implicitly weakened in axiom or τ. ⊏_P reflects the order
in which special positions are reduced. We write p ⊏_P p' if p is reduced before
p', i.e. the reduction occurs closer to the root of the proof tree P. For any
proof P, ⊏_P is irreflexive, antisymmetric, and transitive, thus an ordering, and
≪ (for M) is a subordering of ⊏_P.



Lemma 15. Let M be a matrix, P be a position calculus proof for M, and
p_1, p_2 ∈ (Ψ^M(M) ∪ Ψ^E(M)) be positions in M with p_1 ≠ p_2. If there is a
position p ∈ SpecPos(M) with p_1 ⊏_P p and p_2 ⊏_P p then either p_1 ⊏_P p_2 or
p_2 ⊏_P p_1 holds.

We finally construct the substitution σ_P from ⊏_P. Let φ ∈ Φ^M(M) ∪ Φ^E(M)
be a variable special position in M. We define σ_P(φ) = Z = ψ_1 ... ψ_n where the
string Z ∈ (Ψ^M(M) ∪ Ψ^E(M))* has the following properties.

- sortedness: ψ_i ⊏_P ψ_{i+1} holds for all i ∈ {1, ..., n − 1}.

- prior reduction: ψ_i ⊏_P φ holds for all i ∈ {1, ..., n}.

- exclusivity: p ≪ φ ⇒ p ⊏_P ψ_1 holds for all p ∈ SpecPos(M).

- maximality: For any ψ ∈ (Ψ^M(M) ∪ Ψ^E(M)) not in Z with ψ ⊏_P φ there
exists a p ≪ φ with ψ ⊏_P p.

According to the structure of Σ'_pos-proofs, σ_P really is a prefix substitution.
The construction guarantees that for any φ^M ∈ Φ^M(M), σ_P(φ^M) ∈ (Ψ^M(M))*
holds.

Theorem 16 (Completeness). Let P be a position calculus proof for a matrix
M. Then M is complementary for ConSet(P), WeakMap(P), and σ_P.

Proof. Each of the following properties is proven by induction over the structure of P.
(1) ConSet(P) spans M. (2) σ_P is a unifier for (ConSet(P), WeakMap(P)).
(3) (ConSet(P), WeakMap(P)) is linear, has the relevance property for M, and has
the cardinality property for M. The individual proofs are lengthy because induction
over P and case analysis are necessary. Details can be found in [17].

The Characterization. The characterization theorem proven in this section
is the foundation for matrix based proof search methods. It yields a compactified
representation of the search space which can be exploited by proof search
methods in the same way as for other logics [20]. The method has been extended
uniformly to multiplicative linear logic, as shown in [15]. Along the same lines
an extension to MELL is possible.

Theorem 17 (Characterization Theorem). A formula φ is valid in MELL
if and only if the corresponding matrix is complementary for some multiplicity.

Proof. Correctness follows from theorems 14, 9, 5, 3, and the correctness of
Σ'_1. Completeness follows from theorems 16, 8, 4, 2, and the completeness of
Σ'_1.



Example 18. Let M be the matrix for φ = ⟨(A ⊗ !A) ⅋ ?(A^⊥), +⟩ from figure 4.
We choose C = {{aooooooooi aoooio^oooo}) {aoooooiooo, aoooioioooo}}) W = 0, and 

= { 4 ' w ) oo \^^ '/’otoooooW'oooio^oo' ^otoooioo\V’(tooioioo> '^tooioAV'otoooioi 

^TO010iOOo\^’ '/’OT0102\'*/’000000 i '^OT010200o\^} ■ 

Then M is complementary for 𝒞, W, and σ. Consequently φ is valid in MELL.



6 Conclusion 

We have presented a matrix characterization of logical validity for the full
multiplicative exponential fragment of linear logic (MELL). It extends our
characterization for MLL [15] by the exponentials ? and ! and the multiplicative
constants 1 and ⊥. Our extension, as pointed out in [8], is by no means trivial
and goes beyond all existing matrix characterizations for fragments of linear
logic.

In the process we have also outlined a methodology for developing matrix
characterizations from sequent calculi and for proving them correct and complete.
It introduces a series of intermediate calculi, which remove redundancies from
sequent proofs step by step while capturing their essential parts, and arrives
at a matrix characterization as the most compact representation for proof search.

If applied to modal or intuitionistic logics, this methodology would essentially
lead to Wallen's matrix characterization [24]. In order to capture the resource
sensitivity of linear logic, however, we have introduced several refinements. The
notion of multiplicities is based on positions instead of basic positions. Different
types of special positions are used. The novel concept of weakening maps enables
us to deal with the aspects of resource management.

Fronhöfer has developed matrix characterizations for several variants of the
multiplicative fragment [8]. Compared to his work for linear logic, our
characterization additionally captures the multiplicative constants and the
controlled application of weakening and contraction. In fact, we are confident
that our methodology will extend to further fragments of linear logic as well as
to other resource sensitive logics, such as affine or relevant logics.

In the future we plan to extend our characterization to quantifiers, which 
again is a non-trivial problem although much is known about them in other 
logics. Furthermore, the development of matrix systems as a general theory of 
matrix characterizations has become possible. These systems would include a 
uniform framework for defining notions of complementarity and a methodology 
for supporting the proof of characterization theorems. Matrix systems might also 
enable us to integrate induction into connection-based theorem proving. 

Matrix characterizations are known to be a foundation for efficient proof
search procedures for classical, modal and intuitionistic logics [20] and MLL [15].
We expect that these proof procedures can now be extended to MELL and a
wide spectrum of other logics, as soon as our methodology has led us to a matrix
characterization for them.

Acknowledgements. We would like to thank Serge Autexier for his patience 
and his valuable comments while we were discussing the details of this work.

Abstract. This paper applies resolution theorem proving to natural
language semantics. The aim is to circumvent the computational com- 
plexity triggered by natural language ambiguities like pronoun binding, 
by interleaving pronoun binding with resolution deduction. To this end, 
disambiguation is only applied to expressions that actually occur dur- 
ing derivations. Given a set of premises and a conclusion, our resolution 
method only delivers pronoun bindings that are needed to derive the 
conclusion. 



1 Introduction 

Natural language processing (NLP) has a long tradition in Artificial Intelligence,
but it remains one of the hardest problems in the area. Research areas
such as semantic representation and theorem proving with natural language
have to deal with a problem that is characteristic of natural languages, namely
ambiguity. There are several kinds of ambiguity, see for instance [RN95] for an
overview. In the present paper, we focus on pronoun binding,¹ a certain instance
of ambiguity, as exemplified by (1) below.

(1) A man sees a boy. He whistles. 

Often, there are lots of possibilities to bind a pronoun and it is not clear which 
one to choose. The pronoun he in the short discourse in (1) can be bound in two 
ways as given in (2), where co-indexation indicates referential identity. 

(2) a. A man$_i$ sees a boy. He$_i$ whistles.
b. A man sees a boy$_i$. He$_i$ whistles.

For some cases heuristics are applicable which prefer certain bindings to others, 
but at present there is no approach making use of heuristics which is general 
enough to cover all problems. 

Dynamic semantics [Kam81,GS91] offers a perspicuous solution to
some problems involving pronoun binding. Since we are interested in binding 
occurrences of pronouns to expressions mentioned earlier in a discourse, we take 

¹ Throughout this paper we use the term binding to express the referential identification
of a pronoun and another referential expression occurring in the discourse.
Common terms are also co-indexation or pronoun resolution. We deliberately avoid
the term pronoun resolution to prevent confusion with resolution as a deduction principle.

a slight modification of Dynamic Predicate Logic (DPL) [GS91], where it is not
presupposed that pronouns are already co-indexed. Actually, pronoun binding
falls into the realm of constructing semantic representations of natural language
discourses, and one of the main purposes of constructing these representations
is to reason with them. Now, the question arises which form the input of the
theorem prover should have. Should a theorem prover work only on totally disambiguated
expressions? Total disambiguation results in an explosion of readings,
because of the multiplicative behavior of ambiguity. On the other hand, to prove
a conclusion $\varphi$ from a set of premises $\Gamma$ it may be enough to use only premises
from a small subset $\Delta$ of $\Gamma$, and it may be sufficient, and much more efficient,
to disambiguate only $\Delta$ instead of the whole set of premises $\Gamma$. In general, we
do not know in advance which subset of premises might be enough to derive a 
certain conclusion, but during a derivation often certain (safe) strategies may 
be applied that prevent some premises from being used since they cannot lead 
to the conclusion, anyway. Common strategies to constrain the search space in 
resolution deduction are, e.g., the set-of-support strategy and ordered resolution.
Our goal is to constrain the set of premises that have to be disambiguated by 
interleaving deduction and disambiguation. Roughly speaking, premises are only 
disambiguated if they are used by a deduction rule. 

The rest of the paper is structured as follows. Section 2 provides some rudi- 
mentary background in dynamic semantics and explains what kind of structural 
information is necessary to restrict pronoun binding. In addition, the basics of 
resolution deduction are introduced. Section 3 discusses some of the problems of 
the (standard) resolution method when applied to natural language. The method 
of labeled unification and resolution is presented to overcome these problems. 
Section 4 briefly relates our work to some other approaches to pronoun binding. 
Section 5 provides some conclusions and prospects for further work. 

2 Background 

Before we turn to our method of labeled resolution deduction and its applications 
to discourse semantics, we briefly present the idea of dynamic semantics. The
second subsection briefly explains the classic resolution method for (static) first-order
logic.



2.1 Dynamic Reasoning 

Dynamic reasoning differs from classical reasoning to the extent that sequences of 
formulas are considered instead of sets of formulas. To model discourse relations 
like pronoun binding it is important to take the order of sentences into account 
because two sequences which have the same members, but differ in order, may 
have a different meaning. (Compare ‘A man walks in the park. He whistles.’ and 
‘He whistles. A man walks in the park.’) 

DPL is a semantic framework which works on sequences of formulas and
allows pronoun binding to be represented, where the antecedent of the pronoun
and the pronoun itself may occur in different formulas. This is accomplished by
assigning the existential quantifier flexible binding. In (3.b) a DPL representation
of the short discourse in (3.a) is given.

(3) a. A man$_i$ sees a boy. He$_i$ whistles.

b. $\exists x\,(man(x) \wedge \exists y\,(boy(y) \wedge see(x, y))) \wedge whistle(x)$

The pronoun he is represented by the variable $x$ which is the same as the one
bound by the existential quantifier, but it occurs outside of its scope. To bind $x$
in $whistle(x)$ it is necessary to give the existential quantifier flexible scope.

One of the advantages of dynamic approaches like DPL is that they allow 
for a formal definition of possible antecedents for a pronoun. Without giving 
too many details, we just note that negations function as barriers for flexible 
binding. Therefore, an existential quantifier occurring in the scope of a negation 
cannot bind a pronoun that occurs outside of the negation, as shown by (4). 

(4) *John doesn't own a car$_i$. It$_i$ is in front of his house.

The three properties (a) existential quantifiers can bind variables occurring 
to the right-hand side of their traditional scope, (b) conjunctions preserve the 
flexible scope, and (c) negations are barriers for dynamic binding, allow us to 
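define the properties of the other logical connectives $\vee$, $\rightarrow$ and $\forall$. $[\![\,\cdot\,]\!]$ is a function
that assigns to each formula its semantic value.

(5) $[\![\varphi \vee \psi]\!] = [\![\neg(\neg\varphi \wedge \neg\psi)]\!]$
$[\![\varphi \rightarrow \psi]\!] = [\![\neg(\varphi \wedge \neg\psi)]\!]$
$[\![\forall x\,\varphi]\!] = [\![\neg\exists x\,\neg\varphi]\!]$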

Given these definitions, we see that disjunction is a barrier both internally and 
externally, implication is a barrier externally but internally it allows for flexible 
binding, and universal quantification does not allow for external binding. 

We differ in two respects from DPL. First, we do not allow two or more 
occurrences of $\exists x$ within a single text. The problem is that the second occurrence
of $\exists x$ resets the value of $x$, and thereby previous restrictions on $x$ are lost.
We assume for simplicity that all bound variables are disjoint. This is not a 
severe restriction and an algorithm for constructing semantic representations 
for natural language sentences can easily accomplish this. The second difference 
with DPL is that we do not assume co-indexation of quantifiers and the pronouns 
which they bind. In (3) the variable for he is already assumed to be x and in 
DPL the question of pronoun binding is pushed to some kind of preprocessing. 
But finding the right binding is far from being an easy task and it is very 
complex from a computational point of view. The pronoun in (3) could also
be represented by $y$, indicating that he refers to a boy. For example, a discourse
containing twenty indefinites followed by a sentence with two pronouns has
$20 \cdot 20 = 400$ possible bindings, disregarding any linguistic constraints which
rule out some of the bindings.
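
The multiplicative growth can be made concrete with a minimal sketch. It assumes that every pronoun may be bound to every indefinite and ignores linguistic constraints; the function names `count_bindings` and `enumerate_bindings` are illustrative and not part of the method presented in this paper.

```python
from itertools import product


def count_bindings(candidates_per_pronoun):
    """Number of total disambiguations: one antecedent is chosen per pronoun."""
    total = 1
    for candidates in candidates_per_pronoun:
        total *= len(candidates)
    return total


def enumerate_bindings(candidates_per_pronoun):
    """List every total binding as a tuple of chosen antecedents."""
    return list(product(*candidates_per_pronoun))


# Twenty indefinites, two pronouns, each pronoun may pick any indefinite.
indefinites = [f"x{i}" for i in range(1, 21)]
print(count_bindings([indefinites, indefinites]))
```

Each additional pronoun multiplies the count by the number of its candidate antecedents, which is why total disambiguation before deduction quickly becomes infeasible.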

To this end, we postpone pronoun binding and represent pronouns in the
semantic representation by free variables. Variables for pronouns are displayed
in boldface and are of a different kind than regular variables. Pronoun variables
are bound by the ?-operator. It differs from $\exists$ and $\forall$, because it only binds its
argument, but does not quantify over it. Actually, it is not necessary to have
a special operator for pronouns, and we only introduced it here for the sake
of convenience to identify the position where the pronoun is introduced. Our
representation of (1), repeated as (6.a) below, is given in (6.b). As mentioned
before, co-indexation of pronouns and antecedents is not carried out. 

(6) a. A man sees a boy. He whistles. 

b. $\exists x\,(man(x) \wedge \exists y\,(boy(y) \wedge see(x, y))) \wedge\, ?\mathbf{u}\; whistle(\mathbf{u})$

The decision whether $\mathbf{u}$ has to be substituted by $x$ or by $y$ is postponed to the
deduction component, as motivated in Section 1.

Unlike the existential quantifier, the ?-operator does not have the property 
of flexible binding. We get the following equivalence: 

$[\![\neg\, ?\mathbf{u}\,\varphi]\!] = [\![?\mathbf{u}\,\neg\varphi]\!]$

To define accessibility we can now say that a variable $x$ is accessible from a
pronoun $\mathbf{u}$ if no barrier occurs between the quantifier introducing $x$ and $?\mathbf{u}$. A
formal definition of accessibility is given in the next section. The equations in
(5) show that $\vee$, $\rightarrow$ and $\forall$ introduce barriers because of the way they are defined
in terms of negation. This is exemplified by (7) below.

(7) *Every farmer owns a donkey$_i$. It$_i$ is grey.

Dispensing with the presupposition that pronouns and antecedents are al- 
ready co-indexed re-introduces the concept of ambiguity to our framework. This 
makes it necessary to give a definition of the semantics of ambiguous formulas. It 
is common to define their semantics in terms of their possible disambiguations, 
see [Rey93], and here we follow the same approach. A total disambiguation is
a mapping $\delta$ from ambiguous dynamic formulas to classical first-order formulas.
Disambiguation encompasses two steps. First, we have to find a proper
antecedent for a pronoun. To define proper antecedents, we use the notion of
accessibility. Second, we have to map unambiguous dynamic formulas to classical
formulas. This means that we have to turn flexible quantification into static
quantification, and this involves re-bracketing and quantifier movement. [GS91]
give an algorithm that computes for each DPL-formula $\varphi$ a formula $\varphi'$ which
is in normal binding form, i.e., all pronouns are quantified over in the classical
sense, and which is valid in first-order logic iff $\varphi$ is valid in DPL. For instance,
the normal binding form of (8.b) is (9).

(8) a. If a farmer$_i$ owns a donkey$_j$, then he$_i$ beats it$_j$.

b. $\exists x\,(f(x) \wedge \exists y\,(d(y) \wedge o(x, y))) \rightarrow b(x, y)$

(9) $\forall x \forall y\,(f(x) \wedge d(y) \wedge o(x, y) \rightarrow b(x, y))$

To define the validity of ambiguous formulas, we say that an ambiguous
formula $\varphi$ is valid, i.e., for all models $M$ it holds that $M \models_a \varphi$, if there is a
disambiguation $\delta$ such that $M \models \delta(\varphi)$ for all models $M$. In words: $\varphi$ is valid
iff there exists a disambiguation which is valid in first-order logic.

Unfortunately we do not have enough space to give a more detailed account 
of dynamic semantics, but we refer the reader to [Kam81,GS91]. 



2.2 The Resolution Method 

The resolution method [Rob65] has become quite popular in automated theo- 
rem proving, because it is very efficient and it is easily augmentable by lots of 
strategies which restrict the search space, see e.g., [Lov78]. On the other hand, 
the resolution method has the disadvantage of presupposing that its input has 
to be in clause form, which is a set of clauses, interpreted as a conjunction. A 
clause is a set of literals, interpreted as a disjunction. Probably the most attrac- 
tive feature of resolution is that it has only one inference rule, the resolution rule: 



$$\frac{C \cup \{\neg P_1, \ldots, \neg P_n\} \qquad D \cup \{Q_1, \ldots, Q_m\}}{(C \cup D\pi)\sigma}\ (res)$$

where
• $Q_1, \ldots, Q_m$ are atomic
• $\pi$ is a substitution such that $C \cup \{\neg P_1, \ldots, \neg P_n\}$ and
$D\pi \cup \{Q_1\pi, \ldots, Q_m\pi\}$ are variable disjoint
• $\sigma$ is the most general unifier of $\{P_1, \ldots, P_n, Q_1\pi, \ldots, Q_m\pi\}$

To prove that $\Gamma \models \varphi$ holds we transform $(\bigwedge \Gamma) \wedge \neg\varphi$ into clause form and try to
derive a contradiction (the empty clause) from it by using the resolution rule. 
For a comprehensive introduction to resolution see for instance [Lov78]. 
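
The refutation idea can be illustrated by a minimal sketch for the ground (variable-free) case, where unification reduces to syntactic identity of atoms. The encoding of literals as strings with a leading `~` for negation and the function names are assumptions made for this illustration only.

```python
from itertools import combinations


def negate(lit):
    """Complement of a ground literal written as 'p' or '~p'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def resolvents(c1, c2):
    """All binary resolvents of two ground clauses (frozensets of literals)."""
    out = set()
    for lit in c1:
        if negate(lit) in c2:
            out.add(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out


def refutable(clauses):
    """Saturate under resolution; True iff the empty clause is derived."""
    known = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolvents(c1, c2):
                if not r:
                    return True  # empty clause: contradiction found
                new.add(r)
        if new <= known:
            return False  # saturated without deriving a contradiction
        known |= new


# Gamma = {p, p -> q}, conclusion q: clause form of Gamma plus the negated conclusion.
print(refutable([{"p"}, {"~p", "q"}, {"~q"}]))
```

Saturation terminates here because only finitely many clauses can be built from a finite set of ground literals; the first-order rule above additionally requires renaming and unification.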



3 Dynamic Resolution 

Applying the classical resolution method to a dynamic semantics causes prob- 
lems. Below we will first discuss some of them and then see how we have to 
design our dynamic resolution method to overcome these problems. 



3.1 Adapting the Resolution Method 

There are two problems that we have to find a solution for. First, transforming 
formulas to clause form causes a loss of structural information. Therefore, it 
is sometimes impossible to distinguish between variables that can serve as antecedents
for a pronoun and variables that cannot. The second problem concerns
the duplication of literals which may occur during clause form transformation
and the assumption of the resolution method that clauses are variable disjoint. 
Although the same pronoun may have two occurrences in different clauses, we 
do not want them to be bound by different antecedents. 

Turning to the first problem, in (10) the pronoun $\mathbf{u}$ cannot be bound by the
existential quantifier, whereas the pronoun $\mathbf{z}$ can be bound by it.

(10) a. Every farmer who owns a donkey beats it. It suffers.
b. $\forall x\,((f(x) \wedge \exists y\,(d(y) \wedge o(x, y))) \rightarrow\, ?\mathbf{z}\; b(x, \mathbf{z})) \wedge\, ?\mathbf{u}\; s(\mathbf{u})$

(11) $\{\, \{\neg f(x), \neg d(y), \neg o(x, y), b(x, \mathbf{z})\},\ \{s(\mathbf{u})\} \,\}$

How can we tell which identifications are allowed by looking at the corresponding 
clause form in (11)? How do we know whether a term is accessible? 

We use labels to carry the information about accessible variables. Each pronoun
variable is annotated with a label that indicates the set of accessible variables.
Besides the set of first-order or proper variables ($VAR$), first-order formulas
($FORM$), and pronoun variables ($PVAR$), we are going to introduce the sets of
labeled pronoun variables ($LPVAR$) and labeled formulas ($LFORM$). Labeled
pronoun variables are of the form $V : \mathbf{u}$, where $V \subseteq VAR$ and $\mathbf{u}$ is a pronoun
variable. $LFORM$ is the set of first-order formulas plus formulas containing labeled
pronoun variables. To be able to recognize the antecedents later on, each
variable is annotated with its name ($x^x, y^y, \ldots$), and during skolemization only
the variable is changed, but the label remains unchanged.

To see which variables inside of a formula $\varphi$ can serve as antecedents for
pronouns, [GS91] introduce the function $AQV$ which returns the set of actively
quantifying variables when applied to $\varphi$.

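Definition 1. Let $FORM$ be the set of classical first-order formulas and $VAR$
the set of first-order variables. The function $AQV : FORM \rightarrow POW(VAR)$ is
defined recursively:

$$\begin{array}{lcl}
AQV(R(x_1 \ldots x_n)) & = & \emptyset \\
AQV(\neg\varphi) & = & \emptyset \\
AQV(\varphi \wedge \psi) & = & AQV(\varphi) \cup AQV(\psi) \\
AQV(\varphi \rightarrow \psi) & = & \emptyset \\
AQV(\varphi \vee \psi) & = & \emptyset \\
AQV(\forall x\,\varphi) & = & \emptyset \\
AQV(\exists x\,\varphi) & = & AQV(\varphi) \cup \{x\} \\
AQV(?\mathbf{u}\,\varphi) & = & AQV(\varphi)
\end{array}$$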



Using the above definition we define the notion of accessible variables. 



Definition 2 (Annotation with Accessible Variables). To annotate $\mathbf{u}$ in
$?\mathbf{u}\,\psi$, we drop the binding operator $?\mathbf{u}$ and substitute all occurrences of the
pronoun variable in $\psi$ by its annotated counterpart. The annotation function
$annot : POW(VAR) \times FORM \rightarrow LFORM$ is defined recursively, where $V \subseteq VAR$:

$$\begin{array}{lcl}
annot(V, R(x_1 \ldots x_n)) & = & R(x_1 \ldots x_n) \\
annot(V, \neg\varphi) & = & \neg\, annot(V, \varphi) \\
annot(V, \varphi \wedge \psi) & = & annot(V, \varphi) \wedge annot(V \cup AQV(\varphi), \psi) \\
annot(V, \varphi \rightarrow \psi) & = & annot(V, \varphi) \rightarrow annot(V \cup AQV(\varphi), \psi) \\
annot(V, \varphi \vee \psi) & = & annot(V, \varphi) \vee annot(V, \psi) \\
annot(V, \forall x\,\varphi) & = & \forall x\; annot(V \cup \{x\}, \varphi) \\
annot(V, \exists x\,\varphi) & = & \exists x\; annot(V \cup \{x\}, \varphi) \\
annot(V, ?\mathbf{u}\,\varphi) & = & annot(V, \varphi[\mathbf{u}/V : \mathbf{u}])
\end{array}$$

The actual annotation takes place in the last case, where the pronoun is substituted.
The other cases thread the actively quantifying variables through the
formula. To annotate a whole discourse $\varphi_1 \wedge \cdots \wedge \varphi_n$, the variable parameter of
$annot$ is initialized with $\emptyset$: $annot(\emptyset, \varphi_1 \wedge \cdots \wedge \varphi_n)$. A term $x$ is accessible from a
pronoun $\mathbf{u}$ iff $x$ is an element of the set of accessible variables of $\mathbf{u}$.

Reconsider the last example, Every farmer who owns a donkey beats it. It
suffers. Applying annotation yields:²

$annot(\emptyset, \forall x\,((f(x) \wedge \exists y\,(d(y) \wedge o(x, y))) \rightarrow\, ?\mathbf{z}\; b(x, \mathbf{z})) \wedge\, ?\mathbf{u}\; s(\mathbf{u}))$

$= \forall x\,((f(x) \wedge \exists y\,(d(y) \wedge o(x, y))) \rightarrow b(x, \{x, y\} : \mathbf{z})) \wedge s(\emptyset : \mathbf{u})$

Applying clause form transformation to the annotated formulas yields:

(12) $\{\, \{\neg f(x), \neg d(y), \neg o(x, y), b(x, \{x, y\} : \mathbf{z})\},\ \{s(\emptyset : \mathbf{u})\} \,\}$

We can also see that (10.a) is not well-formed because there are no accessible
antecedents for the second pronoun it, i.e., the label of $\mathbf{u}$ is the empty set.
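
The two recursive definitions can be transcribed almost literally. The following is a minimal sketch under an assumed encoding: formulas are nested tuples, variables are strings, and a labeled pronoun variable $V : \mathbf{u}$ is the pair `(frozenset(V), u)`. The encoding and the helper names (`label_pronoun`, `pronoun_labels`, `atom`) are illustrative only.

```python
# Assumed encoding (not from the paper): formulas are nested tuples
#   ("atom", name, args)  ("not", f)  ("and", f, g)  ("imp", f, g)  ("or", f, g)
#   ("all", x, f)  ("ex", x, f)  ("pro", u, f)      # ("pro", u, f) stands for ?u f

def aqv(f):
    """Actively quantifying variables of f, following Definition 1."""
    tag = f[0]
    if tag == "and":
        return aqv(f[1]) | aqv(f[2])
    if tag == "ex":
        return aqv(f[2]) | {f[1]}
    if tag == "pro":
        return aqv(f[2])
    return frozenset()  # atoms, negation, implication, disjunction, universal


def label_pronoun(f, u, label):
    """Replace every occurrence of pronoun variable u in f by the pair (label, u)."""
    tag = f[0]
    if tag == "atom":
        return ("atom", f[1], tuple((label, u) if a == u else a for a in f[2]))
    if tag == "not":
        return ("not", label_pronoun(f[1], u, label))
    if tag in ("and", "imp", "or"):
        return (tag, label_pronoun(f[1], u, label), label_pronoun(f[2], u, label))
    return (tag, f[1], label_pronoun(f[2], u, label))  # all, ex, pro


def annot(v, f):
    """Annotate pronoun variables with their accessible variables (Definition 2)."""
    tag = f[0]
    if tag == "atom":
        return f
    if tag == "not":
        return ("not", annot(v, f[1]))
    if tag in ("and", "imp"):
        return (tag, annot(v, f[1]), annot(v | aqv(f[1]), f[2]))
    if tag == "or":
        return ("or", annot(v, f[1]), annot(v, f[2]))
    if tag in ("all", "ex"):
        return (tag, f[1], annot(v | {f[1]}, f[2]))
    if tag == "pro":
        return annot(v, label_pronoun(f[2], f[1], frozenset(v)))
    raise ValueError(f"unknown formula tag: {tag}")


def pronoun_labels(f):
    """Collect {pronoun name: label} from an annotated formula."""
    if f[0] == "atom":
        return {a[1]: set(a[0]) for a in f[2] if isinstance(a, tuple)}
    found = {}
    for sub in f[1:]:
        if isinstance(sub, tuple):
            found.update(pronoun_labels(sub))
    return found


def atom(name, *args):
    return ("atom", name, args)


# Example (10): Every farmer who owns a donkey beats it. It suffers.
discourse = ("and",
    ("all", "x", ("imp",
        ("and", atom("f", "x"),
                ("ex", "y", ("and", atom("d", "y"), atom("o", "x", "y")))),
        ("pro", "z", atom("b", "x", "z")))),
    ("pro", "u", atom("s", "u")))

labels = pronoun_labels(annot(frozenset(), discourse))
print(labels["z"], labels["u"])
```

On this input the sketch labels z with the set containing x and y and labels u with the empty set, in line with (12).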

Now we turn to the second problem: how do we make sure that the same 
pronoun, occurring in different clauses, is bound to the same antecedent? As 
we said earlier, we do not want to assume pronouns to be bound in a set of 
premises when we apply resolution. The reason is that pronoun binding is highly 
ambiguous and often it is not necessary to bind all pronouns in a set of premises 
to derive a certain conclusion from it. Another issue, which we briefly hinted at 
in Section 2, is that pronouns should be treated as free variables of a special 
kind, not to be dealt with in the same manner as universally quantified variables 
(which also happen to be represented by free variables). This is illustrated by 
the following example, which shows an invalid entailment. 

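(13) a. $\exists x \exists y\,((A(x) \vee A(y)) \wedge (?\mathbf{z}\, A(\mathbf{z}) \rightarrow (B \wedge C))) \models_a B \vee C$

b. $\{\, \{A(f^x), A(g^y)\},\ \{\neg A(\mathbf{z}), B\},\ \{\neg A(\mathbf{z}), C\},\ \{\neg B\},\ \{\neg C\} \,\}$

The transformation in (13) causes a duplication of the literal $\neg A(\mathbf{z})$, and we have
to make sure that the pronoun is instantiated the same way in both cases.

(14) $\{A(f^x), A(g^y)\}$  $\{\neg A(\mathbf{z}), B\}$  $\{\neg A(\mathbf{z}), C\}$  $\{\neg B\}$  $\{\neg C\}$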




² For simplicity, we neglect the fact that pronouns and their antecedents have to agree
in gender, number, etc.

In (14) $\mathbf{z}$ is instantiated with $f^x$ in the first resolution step and then with $g^y$ in
the second. The resolution rule as it was stated in the preceding section assumes 
that clauses to be resolved are variable disjoint. We have to modify the resolution 
rule such that the same pronoun variable is allowed to occur in both clauses. 
Additionally, the instantiation of a pronoun variable for constructing the most 
general unifier in a resolution step is applied globally, i.e., to all clauses. 



(15) $\{A(f^x), A(g^y)\}$  $\{\neg A(\mathbf{z}), B\}$  $\{\neg A(\mathbf{z}), C\}$  $\{\neg B\}$  $\{\neg C\}$

$\{A(g^y), B\}$  $\{\neg A(f^x), C\}$

$\{A(g^y)\}$  $\{\neg A(f^x)\}$

Global instantiation correctly prevents us from deriving a contradiction in (15). 
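
The difference between instantiating the two copies of the pronoun independently and instantiating them globally can be checked on the ground instances of (13.b). The sketch below does not implement resolution; as a stand-in it uses a brute-force truth-table test, relying on the fact that a ground clause set is refutable by resolution iff it is unsatisfiable. Skolem terms are written `f` and `g` without their annotations, and all function names are illustrative.

```python
from itertools import product


def satisfiable(clauses):
    """Brute-force check over all truth assignments to the ground atoms."""
    atoms = sorted({lit.lstrip("~") for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(atoms)):
        model = dict(zip(atoms, values))
        # A literal '~X' holds iff X is false; a literal 'X' holds iff X is true.
        if all(any(model[lit.lstrip("~")] != lit.startswith("~") for lit in clause)
               for clause in clauses):
            return True
    return False


def instantiate(z1, z2):
    """Clause set (13.b) with z replaced by z1 in the first copy and z2 in the second."""
    return [{"A(f)", "A(g)"}, {f"~A({z1})", "B"}, {f"~A({z2})", "C"}, {"~B"}, {"~C"}]


# Independent instantiation, as in (14): the two copies of z may differ.
print([satisfiable(instantiate(a, b)) for a, b in [("f", "g"), ("g", "f")]])
# Global instantiation, as in (15): both copies receive the same antecedent.
print([satisfiable(instantiate(a, a)) for a in ("f", "g")])
```

With differing instantiations the clause set becomes unsatisfiable, so a contradiction would wrongly be derived; with a single global instantiation it stays satisfiable for either choice of antecedent.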



3.2 Labeled Resolution 

Unification is a fundamental technique in the resolution method. Since we are 
also dealing with labeled variables, we have to think how the unification mech- 
anism has to be adapted. In the course of this subsection, it will turn out that 
pronoun binding can be reduced to unification. 



Labeled Unification. We use the unification algorithm of Martelli and Mon- 
tanari [MM82] as a basis and adapt it in such a way that it can deal with labeled 
pronoun variables. 

What does it mean to unify a set of equations $E = \{s_1 \approx t_1, \ldots, s_n \approx t_n\}$,
where $s_i$ or $t_i$ can also be a labeled pronoun variable? We have to distinguish
three possible cases: (i) neither $s_i$ nor $t_i$ is a labeled pronoun variable, then
labeled unification and normal unification are the same thing, (ii) one of them 
is a pronoun and the other is not, and (iii) both are pronouns. Case (ii) is the 
normal pronoun binding, where one tries to identify a pronoun with a proper 
variable. Case (iii) is not an instance of pronoun binding, but an identification 
of two pronouns, i.e., whatever is the antecedent of the first pronoun, it is also 
the antecedent of the other one. 

Definition 3 (Labeled Unifier). We call a substitution $\sigma$ a labeled unifier or
unifier* of a set of equations $E = \{s_1 \approx t_1, \ldots, s_n \approx t_n\}$ iff

1. $s_1\sigma = t_1\sigma, \ldots, s_n\sigma = t_n\sigma$

2. if $(V : \mathbf{u})\sigma = t^x$, then $x \in V$

3. if $(V : \mathbf{u})\sigma = V' : \mathbf{v}$ then $V' \subseteq V$

We use $\approx$ to express equality in our object language, whereas $=$ denotes equality
in the meta language.

Condition 1 is the normal condition of unifiability, namely that the terms of 
an equation have to be identical after substitution. The second condition says 
that unifiers have to obey accessibility, for instance $\sigma := [\{x, y\} : \mathbf{u}/g^z]$ is not a
unifier* of $\{\{x, y\} : \mathbf{u} \approx g^z\}$ because $g^z$ is not accessible from $\mathbf{u}$, as $z \notin \{x, y\}$.
To ensure that identification of pronouns always restricts the set of accessible 
antecedents, we need condition 3. 

Definition 4 (Most General Labeled Unifier). A labeled unifier $\sigma$ of a set
of equations $E = \{s_1 \approx t_1, \ldots, s_n \approx t_n\}$ is the most general labeled unifier or
mgu* of $E$ if

1. if $\theta$ is a unifier* of $E$ then there is a substitution $\tau$ such that $\theta = \sigma\tau$

2. if $(V : \mathbf{u})\sigma = V_1 : \mathbf{v}$, $(V : \mathbf{u})\theta = V_2 : \mathbf{v}$, $V_1, V_2 \subseteq V$, and $V_1, V_2 \neq \emptyset$,
then $V_2 \subseteq V_1$

Again, the first condition is standard in regular unification. Condition 2 says
that the most general unifier* has to restrict the set of accessible antecedents as
little as possible when identifying pronouns. To unify $V_1 : \mathbf{u}$ and $V_2 : \mathbf{v}$ it suffices
to take any non-empty subset of the intersection of $V_1$ and $V_2$, but this
may prohibit some antecedents from being accessible, although they are in fact
accessible for both pronouns.

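Definition 5 (The Labeled Unification Algorithm). First, the unification
function $unify^*$ is applied to a pair of atoms, and then it tries to unify the set
of corresponding argument pairs. The algorithm terminates successfully if it did
not terminate with failure and no further equations are applicable.

1. $unify^*(R(s_1 \ldots s_n), R(t_1 \ldots t_n))$
$= unify^*(\{s_1 \approx t_1, \ldots, s_n \approx t_n\})$

2. $unify^*(\{f(s_1 \ldots s_n) \approx f(t_1 \ldots t_n)\} \cup E)$
$= unify^*(\{s_1 \approx t_1, \ldots, s_n \approx t_n\} \cup E)$

3. $unify^*(\{f(s_1 \ldots s_n) \approx g(t_1 \ldots t_m)\} \cup E)$, $f \neq g$ or $n \neq m$
$=$ terminate with failure

4. $unify^*(\{x \approx x\} \cup E)$
$= unify^*(E)$

5. $unify^*(\{t \approx x\} \cup E)$, $t \notin VAR$
$= unify^*(\{x \approx t\} \cup E)$

6. $unify^*(\{x \approx t\} \cup E)$, $x \neq t$, $t \notin LPVAR$, $x$ in $t$
$=$ terminate with failure

7. $unify^*(\{x \approx t\} \cup E)$, $x \neq t$, $t \notin LPVAR$, $x$ not in $t$, $x$ in $E$
$= unify^*(\{x \approx t\} \cup E[x/t])$

8. $unify^*(\{V : \mathbf{u} \approx t^x\} \cup E)$, $x \in V$, $V : \mathbf{u}$ in $E$
$= unify^*(\{V : \mathbf{u} \approx t^x\} \cup E[V : \mathbf{u}/t^x])$

9. $unify^*(\{V_1 : \mathbf{u} \approx V_2 : \mathbf{v}\} \cup E)$, $V_1 \cap V_2 \neq \emptyset$, $V_1 \cap V_2 \subsetneq V_2$
$= unify^*(\{V_1 : \mathbf{u} \approx V_1 \cap V_2 : \mathbf{v},\ V_2 : \mathbf{v} \approx V_1 \cap V_2 : \mathbf{v}\}$
$\cup\, E[V_1 : \mathbf{u}/V_1 \cap V_2 : \mathbf{v},\ V_2 : \mathbf{v}/V_1 \cap V_2 : \mathbf{v}])$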

The first six equations of the algorithm are the same as in [MM82], except for
additional side conditions which make sure that $t$ is not a labeled variable. The
interesting cases are 8 and 9. In 8 a pronoun is bound to an antecedent and in 9
two pronouns are identified, i.e., they have the same possible antecedents, namely
those which are accessible for both of them. This is accomplished by identifying
the pronoun variables and substituting the set of possible antecedents by the
intersection of the possible antecedents of each pronoun.

Identification of pronouns is subject to different constraints than binding a pronoun
to a proper antecedent. To identify two pronouns $\mathbf{u}$ and $\mathbf{v}$, it is not required
that $\mathbf{u}$ is accessible from $\mathbf{v}$, or the other way around. But they can only be identified
if they have at least one proper accessible antecedent in common.

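(16) Buk is a poet. For every man there is a woman who hates him.
$\models_a$ There is a woman who hates him.

(17) $p(b) \wedge \forall x\,(m(x) \rightarrow \exists y\,(w(y) \wedge\, ?\mathbf{u}\; h(y, \mathbf{u})))$
$\models_a \exists z\,(w(z) \wedge\, ?\mathbf{v}\; h(z, \mathbf{v}))$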

For instance, in (16) the conclusion is only valid if the first and the second occur- 
rence of him are identified. In Section 2 it was said that universal quantification 
is a barrier for flexible binding, and therefore the second occurrence of him can- 
not be bound to the first one. On the other hand, both of them have a proper 
antecedent in common, namely the constant b representing the proper name 
Buk. In addition, the first occurrence of him has the variable x as an accessible 
antecedent, introduced by the universal quantification every man. If one wants 
to identify them, one has to take the intersection of both sets of accessible antecedents
and hence drop $x$ as a possible antecedent. Observe that identification
of pronouns still leaves some space for underspecification, because the intersection
of the labels of two pronouns does not have to be a singleton. Of course, identifying two
pronouns, where more than one antecedent is accessible for both, forces them 
to be bound to the same element of the intersection. Both can be bound to any 
element of the intersection, but it has to be the same one for both pronouns. 
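
The two pronoun-specific operations, binding (case 8) and identification (case 9), can be sketched in isolation. This is a simplified illustration and not the full algorithm: it omits function terms, the occurs check, the propagation of substitutions into the remaining equations, and the proper-subset side condition of case 9. The classes `Term` and `Pronoun` and the functions `bind` and `identify` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    name: str    # e.g. a constant or Skolem term
    origin: str  # the variable name carried as annotation, e.g. "y" for f^y


@dataclass(frozen=True)
class Pronoun:
    label: frozenset  # accessible antecedents V
    name: str         # pronoun variable u


def bind(pronoun, term):
    """Case 8: bind a labeled pronoun to a term whose annotation is accessible."""
    if term.origin not in pronoun.label:
        return None  # accessibility violated, no labeled unifier
    return {pronoun: term}


def identify(p, q):
    """Case 9: identify two pronouns; both labels shrink to the intersection."""
    common = p.label & q.label
    if not common:
        return None  # no shared accessible antecedent
    merged = Pronoun(common, q.name)
    return {p: merged, q: merged}


# The two occurrences of "him" in (16): labels {b, x} and {b}.
u = Pronoun(frozenset({"b", "x"}), "u")
v = Pronoun(frozenset({"b"}), "v")
print(identify(u, v)[u].label)
```

Identification keeps only the shared antecedent b, so x is dropped as a candidate for the first pronoun, which mirrors the discussion of (16).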

If the unification algorithm terminates successfully for a pair of literals $P, Q$,
the solved set determines a substitution $\sigma$ that is the mgu* of $P, Q$:

$\sigma := \{s/t \mid s \approx t \in unify^*(P, Q)\}$.

A set of equations $\{s_1 \approx t_1, \ldots, s_n \approx t_n\}$ is called solved if

1. $s_i \in VAR \cup LPVAR$ and the $s_i$ are pairwise disjoint

2. no $s_i$ occurs in a term $t_j$ ($1 \leq i, j \leq n$).
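
The solved-form conditions can be checked mechanically. The sketch below assumes an illustrative term encoding in which variables and constants are strings and compound terms are tuples `(functor, arg1, ...)`; the set `variables` plays the role of $VAR \cup LPVAR$, and "pairwise disjoint" is read as "pairwise distinct". The function names are not taken from the paper.

```python
def occurs(var, term):
    """True if var occurs in term; compound terms are tuples (functor, arg1, ...)."""
    if term == var:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, arg) for arg in term[1:])
    return False


def is_solved(equations, variables):
    """Check the two solved-form conditions for a list of (lhs, rhs) equations."""
    lhs = [s for s, _ in equations]
    if any(s not in variables for s in lhs):
        return False  # condition 1: every left side is a variable
    if len(set(lhs)) != len(lhs):
        return False  # condition 1: left sides are pairwise distinct
    # condition 2: no left side occurs in any right side
    return not any(occurs(s, t) for s in lhs for _, t in equations)


print(is_solved([("x", ("f", "y")), ("z", "a")], {"x", "y", "z"}))
```

A solved set can be read directly as a substitution, since no left side is mentioned again on any right side.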



Lemma 6 (Correctness of the Unification* Algorithm). Let $E$ be a set
of equations and $unify^*(E) = E'$, then

(i) $E$ is unifiable* iff $E'$ is unifiable*

(ii) $\sigma$ is the mgu* of $E$ iff $\sigma$ is the mgu* of $E'$

Proof. (i) We have to show that actions 2, 4, 5, 7, 8, and 9 preserve unifiability*,
when $unify^*$ is applied to a unifiable* set $E$. For 2, 4, and 5, this is obvious. To
show it for 7, note that $\tau := [x/t]$ is a unifier* of $x$ and $t$. If $\sigma$ is a unifier* of
$\{x \approx t\} \cup E$ then $\sigma$ is of the form $\tau\rho$. Because $\tau\tau = \tau$, it holds that $\sigma = \tau\rho =
\tau\tau\rho = \tau\sigma$. Therefore $\sigma$ unifies* $\{x \approx t\} \cup E$ iff $\sigma$ unifies* $\{x \approx t\} \cup E[x/t]$. 8 is
analogous to 7, plus the additional side condition that $x \in V$. The last case is
9. If $\{V_1 : \mathbf{u} \approx V_2 : \mathbf{v}\} \cup E$ is unifiable*, then it is with a unifier* $\sigma$ of the form $\tau\rho$
with $\tau := [V_1 : \mathbf{u}/V_1 \cap V_2 : \mathbf{v},\ V_2 : \mathbf{v}/V_1 \cap V_2 : \mathbf{v}]$.

Again, $\sigma = \tau\rho = \tau\tau\rho = \tau\sigma$ and then $\sigma$ also unifies*

$\{V_1 : \mathbf{u} \approx V_1 \cap V_2 : \mathbf{v},\ V_2 : \mathbf{v} \approx V_1 \cap V_2 : \mathbf{v}\} \cup E[V_1 : \mathbf{u}/V_1 \cap V_2 : \mathbf{v},\ V_2 : \mathbf{v}/V_1 \cap V_2 : \mathbf{v}]$.

(ii) The actions 2, 4, 5, 7, and 8 turn a set of equations into an equivalent
one. For $\sigma$ to be the mgu* of $\{V_1 : \mathbf{u} \approx V_2 : \mathbf{v}\} \cup E$ means according to our
definition that $\sigma$ has to be of the form $\tau\rho$, where

$\tau := [V_1 : \mathbf{u}/V_1 \cap V_2 : \mathbf{v},\ V_2 : \mathbf{v}/V_1 \cap V_2 : \mathbf{v}]$.

But then $\sigma$ is also the mgu* of

$\{V_1 : \mathbf{u} \approx V_1 \cap V_2 : \mathbf{v},\ V_2 : \mathbf{v} \approx V_1 \cap V_2 : \mathbf{v}\} \cup E[V_1 : \mathbf{u}/V_1 \cap V_2 : \mathbf{v},\ V_2 : \mathbf{v}/V_1 \cap V_2 : \mathbf{v}]$. □



Lemma 7 (Termination of the Unification* Algorithm). The unification* 
algorithm terminates for each finite set of equations. 

Proof. If rule 3 or rule 6 is applied, we are done. Otherwise, rule 7 can be applied
only once, because after application the side condition is no longer fulfilled. In
9 it is presupposed that $V_1 \cap V_2$ is a proper subset of $V_2$; this ensures that an
application of 9 really reduces the set of possible antecedents. Because 9 can be
applied only a finite number of times, it can reintroduce a term $V : \mathbf{u}$ only finitely
often, therefore rule 8 can also be applied only finitely many times. Rules 1, 5,
and 6 are only applied once, and the number of possible applications of rule 2 
is finite as well, because terms contain only finitely many symbols. Therefore all 
rules can be applied only finitely many times, and termination follows. □ 



Proposition 8 (Total Correctness of the Unification* Algorithm). 

The unification* algorithm computes for each finite set of equations $E$ a solved
set that has the same mgu* as $E$ in finitely many steps iff $E$ is unifiable*.

Proof. The fact that the unification* algorithm preserves unifiability* and that
it terminates has been proven in Lemmas 6 and 7, respectively. It remains to be
shown that the set of equations computed by the algorithm is a solved set. In 7,
8, and 9, the left side of the equation is always substituted in $E$ by the right side
of the equation. If the left side is identical to the right side, the equation is erased 
by rule 4. Therefore, no left side of an equation occurs somewhere else. □ 



The Resolution Method. Having defined labeled unification, it is straightforward
to adapt the resolution principle. The only thing we have to change is to
make sure that variable disjointness applies only to proper variables (elements
of $VAR$). The function $VAR$ returns the set of proper variables, when applied to
a set of clauses $\Delta$: $VAR(\Delta) = \{x \in VAR \mid x \text{ occurs in } \Delta\}$. The resolution rule
accomplishing pronoun binding ($res_p$) is defined as follows:

$$\frac{C \cup \{\neg P_1, \ldots, \neg P_n\} \qquad D \cup \{Q_1, \ldots, Q_m\}}{(C \cup D\pi)\sigma}\ (res_p)$$

where
• $Q_1, \ldots, Q_m$ are atomic
• $\pi$ is a substitution such that
$VAR(C \cup \{\neg P_1, \ldots, \neg P_n\}) \cap (VAR(D \cup \{Q_1, \ldots, Q_m\}))\pi = \emptyset$
• $\sigma$ is the mgu* of $\{P_1, \ldots, P_n, Q_1\pi, \ldots, Q_m\pi\}$



Definition 9 (The Proof Algorithm). Our proof algorithm prf consists of 
three steps: 

1. annotate the conjunction of the premises and the negation of the conclusion;

2. apply clause form transformation; and 

3. apply the resolution rule until a contradiction can be derived, or no new 
resolvents can be generated. 



An Example. We will only give a very short, and therefore very simple example 
of a labeled resolution derivation. We hope that it illustrates some of the aspects 
of labeled resolution mentioned before. 

Consider example (16) again, here repeated as (18), where (19) is the corre- 
sponding semantic representation. 

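(18) Buk is a poet. For every man there is a woman who hates him.
$\models_a$ There is a woman who hates him.

(19) $p(b) \wedge \forall x\,(m(x) \rightarrow \exists y\,(w(y) \wedge\, ?\mathbf{u}\; h(y, \mathbf{u})))$
$\models_a \exists z\,(w(z) \wedge\, ?\mathbf{v}\; h(z, \mathbf{v}))$

Annotating (19):

$annot(\emptyset, p(b) \wedge \forall x\,(m(x) \rightarrow \exists y\,(w(y) \wedge\, ?\mathbf{u}\; h(y, \mathbf{u}))) \wedge \neg\exists z\,(w(z) \wedge\, ?\mathbf{v}\; h(z, \mathbf{v}))) =$
$p(b) \wedge \forall x\,(m(x) \rightarrow \exists y\,(w(y) \wedge h(y, \{b, x\} : \mathbf{u}))) \wedge \neg\exists z\,(w(z) \wedge h(z, \{b\} : \mathbf{v}))$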

Clause form transformation: 

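$\{p(b^b)\},\ \{m(h^h)\},\ \{\neg m(x^x), w(f^y)\},\ \{\neg m(x^x), h(f^y, \{b, x\} : \mathbf{u})\},\ \{\neg w(z^z), \neg h(z^z, \{b\} : \mathbf{v})\}$,

where the additional clause $\{m(h^h)\}$ stems from the assumption that the domain
of men is nonempty.

Resolution:

$\{p(b^b)\}$  $\{\neg m(x^x), w(f^y)\}$  $\{\neg m(x^x), h(f^y, \{b, x\} : \mathbf{u})\}$  $\{\neg w(z^z), \neg h(z^z, \{b\} : \mathbf{v})\}$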




Actually, the only remarkable step in the derivation is resolving 
$\{\neg m(x), h(f, \{b, x\} : \mathbf{u})\}$ and $\{\neg w(z), \neg h(z, \{b\} : \mathbf{v})\}$
with $\{\neg m(x), \neg w(f)\}$ as the resolvent. Here, the two labeled pronoun variables
can be identified, because the intersection of their accessible antecedents is 
nonempty. The corresponding mgu* of 




196 Christof Monz and Maarten de Rijke 



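$\{\neg m(x^x), h(f^y, \{b, x\} : \mathbf{u})\}$ and $\{\neg w(z^z), \neg h(z^z, \{b\} : \mathbf{v})\}$

is $\sigma := [x^x/z^z,\ z^z/f^y,\ \{b, x\} : \mathbf{u}/\{b\} : \mathbf{v}]$.

Note also that although $p(b)$ introduced the antecedent $b$, it is not used in the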
derivation because all information that is necessary to derive the contradiction 
is captured by the labels. This is the advantage of using labels; it allows us to 
express non-local dependency relations in our framework, which is essential for 
dealing with pronoun binding in dynamic semantics where a pronoun and its 
antecedent can occur in different formulas. 



Evaluation from a Linguistic Point of View. In general, it is not enough
to know that there is a binding that allows a conclusion to be derived;
one also wants to know which binding. It is easy to augment
our method so that it accomplishes this, simply by recording the
substitutions of pronoun variables that occur during a derivation.

From a linguistic point of view, one is also interested in comparing different
bindings. If we force the proof procedure to backtrack every time it has found
a binding which allows a contradiction to be derived, we can generate all
possible bindings. Some of the bindings are probably preferable to others when
linguistic heuristics for pronoun resolution are taken into account, see for
instance [GJW95], but this is beyond the scope of the present paper.

3.3 Results 

Before we prove completeness and soundness of our method, we have to explain 
what these notions mean in our setting. 

To show that the resolution principle is correct we have to find the right 
loop invariant. We will show that if the parent clauses of a resolution step are 
strongly satisfiable, then so is the resolvent. 

Definition 10 (Strong Satisfiability). We say that a clause $C$ is strongly
satisfiable if there is a model $M$ such that for all substitutions $\theta$
(from PVAR to VAR) there is a literal $L \in C\theta$ such that $M \models L$.



Lemma 11. Let $C \cup \{\neg P_1, \dots, \neg P_n\}$ and $D \cup \{Q_1, \dots, Q_m\}$ be
variable disjoint and strongly satisfiable. If $\sigma$ is the mgu* of
$\{P_1, \dots, P_n, Q_1, \dots, Q_m\}$, then
$C\sigma \setminus \{\neg P_1\sigma\} \cup D\sigma \setminus \{P_1\sigma\}$ is strongly satisfiable.

Proof. The set of possible disambiguations of the resolvent is a subset of the 
possible disambiguations of the parent clauses, because possible antecedents are 
unified, and in case of pronoun unification only the intersection of possible an- 
tecedents has to be considered. Now, two cases have to be distinguished. 

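(i) $M \not\models P_1\sigma$. Because $P_1\sigma$ is an instance of $P_1$ and
$D \cup \{Q_1, \dots, Q_m\}$ is strongly satisfiable, it holds that
$M \models D\sigma \setminus \{P_1\sigma\}$. But $D\sigma \setminus \{P_1\sigma\}$ is a subset of the
resolvent and therefore $M \models C\sigma \setminus \{\neg P_1\sigma\} \cup D\sigma \setminus \{P_1\sigma\}$.

(ii) $M \models P_1\sigma$. Again, $P_1\sigma$ is an instance of $P_1$ and
$C \cup \{\neg P_1, \dots, \neg P_n\}$ is strongly satisfiable. Hence, it holds that
$M \models C\sigma \setminus \{\neg P_1\sigma\}$ and thereby
$M \models C\sigma \setminus \{\neg P_1\sigma\} \cup D\sigma \setminus \{P_1\sigma\}$. □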



Corollary 12 (Soundness). If prf (see Definition 9) produces the empty clause
on input $\neg\varphi$, then $\varphi$ is valid.

Proof. If we can derive □ from a set of clauses $C$, where $C$ is the clause form
of $\neg\varphi$, then we can show by induction that $C$ is not strongly satisfiable,
i.e., there is no model $M$ such that $M \models C\theta$ for all possible substitutions
$\theta$. Hence, for all models $M$, there is a disambiguation $\delta$, such that
$M \not\models \delta(\neg\varphi)$, which is equivalent to $M \models_a \varphi$, the definition of
$\varphi$ being valid. □



Lemma 13. Let $\theta$ be a total disambiguation of $\varphi$, and assume that $\varphi\theta$ is
unsatisfiable. Then there is a (classical) resolution deduction of □ from $\varphi\theta$.



Lemma 14. Let $\theta$ be a total disambiguation of $\varphi$. If there is a resolution
deduction of □ from $\varphi\theta$, then prf generates the empty clause on input $\varphi$.

Proof. The idea of the proof is to turn the classical resolution proof of □ from
$\delta(\varphi)$ into a labeled resolution proof of □ from the original formula $\varphi$ by
repeating the resolution steps and inserting the required substitutions (i.e.,
partial disambiguations) just before any steps where they were used in the
original proof.

Although the idea of this proof is simple, the details are too numerous to be 
included here. □ 



Corollary 15 (Completeness). If $\varphi$ is valid, then prf generates the empty
clause on input $\neg\varphi$.

4 Related Work 

Most work in the area of ambiguity and discourse semantics focuses on repre- 
sentational issues, but see [vEJ96,MdR98] for calculi for quantificational ambi- 
guities. Approaches that deal with pronoun binding are mostly trying to bind 
pronouns by applying some heuristics. The work that is closest to ours is the 
approach of Kohlhase and Konrad [KK98] who deal with pronoun binding in 
the setting of natural language corrections by using higher-order unification, 
and a higher-order tableaux method [Koh95] to reason about possible bindings. 
Van Eijck [vE98] presents a sequent calculus for DPL which deals with some of 
the complications we avoided in this paper; for instance multiple quantification 
of the same variable. Some of the ways in which dynamic updating can restrict 
possible pronoun bindings are considered in [Mon98] . 

5 Conclusion

In this paper we have presented a resolution calculus for reasoning with ambigu- 
ities triggered by pronouns and the different ways to bind them. Deduction steps 
and pronoun bindings are interleaved with the effect that only pronouns that 
are used during a derivation are bound to a possible antecedent. Labels allow 
us to capture relevant structural information of the original formula on a very 
local level, namely by annotating variables. Therefore structural manipulation, 
a prerequisite of any efficient proof method, does no harm. 

Our ongoing work focuses on two aspects. First, we have to see how our resolu- 
tion method behaves when other strategies restricting the search space are added; 
e.g., set-of-support strategy, ordered unification, or subsumption checking. Sec- 
ond, we are in the process of implementing the annotation and unification* 
algorithms and are trying to integrate them into a resolution theorem prover. 

Acknowledgment. The research in this paper was supported by the Spinoza 
project ‘Logic in Action’ at ILLC, University of Amsterdam. 

Abstract. The importance of models within automated deduction is
generally acknowledged both in constructing countermodels (rather than
just giving the answer "NO" if a given formula is found not to be a
theorem) and in speeding up the deduction process itself (e.g. by semantic
resolution refinement).

However, little attention has been paid so far to the efficiency of algo- 
rithms to actually work with models. There are two fundamental decision 
problems as far as models are concerned, namely: the equivalence of 2 
models and the truth evaluation of an arbitrary clause within a given 
model. This paper focuses on the efficiency of algorithms for these prob- 
lems in case of Herbrand models given through atomic representations. 
Both problems have been shown to be coNP-hard in [Got 97], so there is
a certain limit to the efficiency that we can possibly expect. Nevertheless,
what we can do is find out the real "source" of complexity and make use
of this theoretical result for devising an algorithm which, in general, has
a considerably smaller upper bound on the complexity than previously
known algorithms, e.g.: the partial saturation method in [FL 96] and the
transformation to equational problems in [CZ 91].

The main results of this paper are algorithms for these two decision
problems, where the complexity depends non-polynomially on the number
of atoms (rather than on the total size) of the input model equivalence 
problem or clause evaluation problem, respectively. Hence, in contrast 
to the above mentioned algorithms, the complexity of the expressions 
involved (e.g.: the arity of the predicate symbols and, in particular, the 
term depth of the arguments) only has polynomial influence on the over- 
all complexity of the algorithms. 



1 Introduction 

Models play an increasingly important role in automated theorem proving. Their 
applicability is basically twofold: Firstly, rather than just proving that some 
input formula is not a theorem, it would be desirable for a theorem prover to 
provide some insight as to why a given formula is not a theorem. To this end,
the theorem prover tries to construct a countermodel rather than just giving the
answer "NO". Consequently, over the past few years, automated model building
has evolved as an important discipline within the field of automated deduction.
The second application of models arises from the idea of guiding the proof search
by providing some additional knowledge on the domain from which the input
formula is taken. This knowledge can be represented in the form of a model,
which may then be used e.g. in semantic resolution.

In any case, an appropriate representation of models is called for. The follow- 
ing properties are indispensable prerequisites for any practically relevant model 
representation: 

1. "reasonable" expressive power: A minimum requirement on the expressive
power is the possibility to finitely represent infinite models.

2. algorithms for two decision problems: There are two decision problems in- 
volved in the actual work with models, namely: deciding the equivalence of 
models and evaluating an arbitrary clause in such a model. 

In [FL 96] , atomic representations of Herbrand models (= AR’s) are shown to be 
a very useful formalism: On the one hand, it is proven that the above mentioned 
requirements are met and, on the other hand, an algorithm is presented, which 
allows the automatic model construction for satisfiable clause sets of a certain 
syntax class. 

However, the efficiency of algorithms concerning model equivalence and clause
evaluation, which is an important issue for the practical applicability of a
model representation, has received little attention so far. What we are
interested in here is the (time) complexity of such algorithms for AR’s.
Actually, both decision problems (i.e.: the model equivalence and the evaluation
of a clause to true) have been shown to be coNP-hard even for the special case
of linear atomic representations and even if the Herbrand universe contains no
function symbols (cf. [Got 97]). Therefore, we cannot expect to find a
polynomial algorithm without giving a positive answer to the P = NP problem.
However, what we can do is find out the real "source" of complexity and make use
of this theoretical result for devising an algorithm which is, in general,
considerably more efficient than previously known ones, e.g.: the partial
saturation method in [FL 96] and the transformation to equational problems in
[CZ 91]. The main results of this work are algorithms for solving the model
equivalence problem and the clause evaluation problem for AR’s, where the time
complexity is non-polynomial only in the number of atoms. In contrast to the
methods of [FL 96] and [CZ 91], the complexity of the expressions involved
(e.g.: the arity of the predicate symbols and, in particular, the term depth of
the arguments) only has polynomial influence on the complexity of our algorithms.

This paper is organized as follows: After briefly reviewing some basic
terminology in chapter 2, we shall provide algorithms for the two decision
problems mentioned above, namely the model equivalence problem (in chapters 3
and 4) and the clause evaluation problem (in chapter 5), respectively: In
chapter 3, we present a transformation of the original model equivalence problem
into another type of problem, which we shall refer to as the term tuple cover
problem, i.e.: Given a set
$M = \{(t_{11}, \dots, t_{1k}), \dots, (t_{n1}, \dots, t_{nk})\}$ of k-tuples of terms
over some Herbrand universe H, is every ground term tuple
$(s_1, \dots, s_k) \in H^k$ an instance of some tuple $(t_{i1}, \dots, t_{ik}) \in M$?
An algorithm for the solution of the term tuple cover problem will be presented
in chapter 4. The clause evaluation problem for AR’s will be tackled in chapter
5. Again we first transform the original problem to another kind of problem
which we shall call the term tuple inclusion problem, i.e.: Given term tuple
sets $M^{(1)}, \dots, M^{(m)}$ and $M$, is every common H-ground instance s of
$M^{(1)}, \dots, M^{(m)}$ also an instance of $M$? (Note that we call a ground
term tuple v an instance of a set of tuples $W = \{w_1, \dots, w_n\}$, iff there
exists some $w_i \in W$, s.t. v is an instance of $w_i$.) A solution for the term
tuple inclusion problem is then provided by transforming it to a set of term
tuple cover problems. Finally, in chapter 6 we shall summarize the main results
of this paper and identify directions for future work. For reference, some
related methods are briefly sketched and their complexity is compared with our
algorithm in the appendix.



2 Preliminary Definitions 

The following basic definitions from [FL 96] are also central to our
considerations: An atomic representation of a Herbrand model (= AR) over some
Herbrand universe H is a set $\mathcal{A} = \{A_1, \dots, A_n\}$ of atoms over H with the
following intended meaning: a ground atom over H evaluates to true, iff it is an
instance of some atom $A_i \in \mathcal{A}$. In a linear atomic representation (= LAR),
all atoms are linear, i.e.: they have no multiple variable occurrences. Two AR’s
$\mathcal{A}$ and $\mathcal{B}$ are equivalent, iff they represent the same (Herbrand) model,
i.e.: the same ground atoms evaluate to true in both models. We say that a set
$\mathcal{C} = \{C_1, \dots, C_n\}$ of clauses H-subsumes a clause D, iff all H-ground
instances of D are subsumed by some clause $C_i \in \mathcal{C}$. For a term t over H,
we denote the set of H-ground instances of t by $G_H(t)$.

The generalization of these concepts from simple terms to term tuples is
straightforward, e.g.: By $G_H(t_1, \dots, t_k)$ we denote the set of H-ground
instances generated by the term tuple $(t_1, \dots, t_k)$.
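
The intended meaning of an AR can be made concrete with a short sketch: a
ground atom is true iff it is an instance of some atom of the representation,
which amounts to one-sided unification (matching). The encoding of terms as
nested Python tuples, the function names and the sample representation
{P(x, f(x)), Q(a)} are assumptions made for illustration only; they are not
taken from [FL 96].

```python
# Encoding (an assumption for this sketch): a variable is a str, a function
# term or atom is a tuple (symbol, arg1, ..., argn), a constant is ("a",).

def match(pattern, ground, subst):
    """Extend subst so that pattern instantiated by subst equals ground;
    return None if no such extension exists."""
    if isinstance(pattern, str):                      # pattern is a variable
        if pattern in subst:
            return subst if subst[pattern] == ground else None
        return {**subst, pattern: ground}
    if pattern[0] != ground[0] or len(pattern) != len(ground):
        return None
    for p_arg, g_arg in zip(pattern[1:], ground[1:]):
        subst = match(p_arg, g_arg, subst)
        if subst is None:
            return None
    return subst


def holds(ar, ground_atom):
    """A ground atom is true iff it is an instance of some atom of the AR."""
    return any(match(atom, ground_atom, {}) is not None for atom in ar)


def variable_occurrences(term):
    """List of all variable occurrences in a term, with repetitions."""
    if isinstance(term, str):
        return [term]
    return [v for arg in term[1:] for v in variable_occurrences(arg)]


def is_linear(ar):
    """An AR is linear iff no atom has multiple occurrences of a variable."""
    return all(len(occ) == len(set(occ))
               for occ in map(variable_occurrences, ar))


a = ("a",)
ar = [("P", "x", ("f", "x")), ("Q", a)]   # hypothetical AR {P(x, f(x)), Q(a)}
```

With this representation, P(a, f(a)) is true while P(a, f(f(a))) is false,
and the AR is not linear because x occurs twice in P(x, f(x)).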

3 Transformation of the Model Equivalence Problem

In [FL 96] , the following criterion for the equivalence of AR’s is stated: 

Lemma 1. (H-subsumption criterion) Let $\mathcal{A} = \{A_1, \dots, A_n\}$ and
$\mathcal{B} = \{B_1, \dots, B_m\}$ be AR’s w.r.t. some Herbrand universe H. Then $\mathcal{A}$
and $\mathcal{B}$ are equivalent, iff $\{A_1, \dots, A_n\}$ H-subsumes $B_j$ for every
$j \in \{1, \dots, m\}$ and $\{B_1, \dots, B_m\}$ H-subsumes $A_i$ for every
$i \in \{1, \dots, n\}$.

This characterization of model equivalence provides the starting point for our 
considerations. The following theorem shows how the H-subsumption criterion 
can be further transformed: 

Theorem 2. (transformation of the H-subsumption problem) Let
$B, A_1, \dots, A_n$ be atoms over some Herbrand universe H. Furthermore, let
$V(B) = \{x_1, \dots, x_k\}$ denote the variables occurring in B and suppose that
$V(B) \cap V(A_i) = \emptyset$ for all $i \in \{1, \dots, n\}$ (i.e.: B and the $A_i$’s
have no variables in common). Finally, let $\Theta$ denote the set of unifiers of
pairs $(A_i, B)$, i.e.:

$\Theta = \{\vartheta \mid (\exists i)$ s.t. $A_i$ and $B$ are unifiable with $\vartheta' = mgu(A_i, B)$ and $\vartheta = \vartheta'|_{V(B)}\}$

Then $\{A_1, \dots, A_n\}$ H-subsumes $B$, iff
$\bigcup_{\vartheta \in \Theta} G_H(x_1\vartheta, \dots, x_k\vartheta) = H^k$ holds.

Proof (sketch): B is H-subsumed, iff all ground instances are subsumed by some
$A_i$. Obviously, only those instances of the $A_i$’s play a role, which are
unifiable with B, i.e.: For every ground substitution $\sigma$, $B\sigma$ must be
subsumed by some $A_i\vartheta'_i = B\vartheta'_i = B\vartheta_i$, where
$\vartheta'_i = mgu(A_i, B)$ and $\vartheta_i = \vartheta'_i|_{V(B)}$. But this is the
case, iff for every H-ground substitution $\sigma$, $(x_1\sigma, \dots, x_k\sigma)$ is a
ground instance of some $(x_1\vartheta_i, \dots, x_k\vartheta_i)$ with $\vartheta_i \in \Theta$. ◇

Remark: As far as complexity is concerned, the original model equivalence 
problem and the resulting collection of term tuple cover problems are basically 
the same: Both problems are coNP-hard, the number of (term tuple cover-) sub- 
problems and the number of term tuples within each subproblem are restricted 
by the number of atoms, the total length of each term tuple cover problem 
is restricted by the length of the original model equivalence problem (provided 
that an efficient unification algorithm is used, which represents terms as directed 
acyclic graphs rather than as strings of symbols), etc. 

4 The Term Tuple Cover Problem 

In this chapter we shall construct an algorithm which solves the term tuple cover
problem. The target of such an algorithm is to transform a given term tuple
set into a "particularly simple form", for which it is easy to decide whether it
represents a positive or negative instance of the term tuple cover problem. The
following definition of "solved term tuple sets" makes this idea precise.

Definition 3. (solved term tuple set) We call a term tuple set M solved, iff
either $M = \emptyset$ or M contains a tuple $(x_1, \dots, x_k)$ of pairwise distinct
variables.

Note that if $M = \emptyset$ then no ground instance at all is covered by M and,
therefore, M trivially represents a negative instance of the term tuple cover
problem. Likewise, if M contains a tuple $(x_1, \dots, x_k)$ of pairwise distinct
variables, then M trivially represents a positive instance of the term tuple
cover problem.
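
A minimal sketch of this test, assuming that a variable is encoded as a Python
string and a non-variable term as a tuple (both the encoding and the function
names are illustrative choices, not part of the paper):

```python
# Encoding assumed here: variable = str, non-variable term = tuple.

def is_solved(tuples):
    """Definition 3: solved iff the set is empty or contains a tuple that
    consists of pairwise distinct variables."""
    if not tuples:
        return True
    return any(all(isinstance(t, str) for t in tup) and len(set(tup)) == len(tup)
               for tup in tuples)


def answer_for_solved(tuples):
    """For a solved set, the cover problem is positive iff the set is non-empty."""
    if not is_solved(tuples):
        raise ValueError("term tuple set is not in solved form")
    return bool(tuples)
```

For instance, {(x, y), (x, f(y))} is solved and positive, the empty set is
solved and negative, and {(x, x)} is not solved.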

The central idea of the transformation to solved term tuple sets is the fol- 
lowing: Divide the original problem (with n term tuples) into subproblems s.t. 
the number of term tuples in each subproblem is strictly smaller than n and 
the number of subproblems is bounded by n rather than by the total input length. 
Our algorithm will comprise two principal components, namely: A division into 
subproblems, which is based on an appropriate partition of the Herbrand uni- 
verse H, and redundancy criteria which control both the number and the size 
of the resulting subproblems. The basic partition of H is already well-known 
from previous algorithms, while the latter one is new. Moreover, the redundancy 
criteria are the main reason why our algorithm is, in general, considerably more
efficient than the other algorithms.

The basic form of our division of a term tuple cover problem into subproblems
will be based on the following partition of the Herbrand universe H, which is also
central to the explosion rule for solving equational problems in [CL 89] and to
the orthogonalization method in [FL 96]: Let FS(H) denote the set of function
symbols of H (constants are considered as function symbols of arity 0). Then
every ground term of H has exactly one $f \in FS(H)$ as leading symbol. Hence,
H can be partitioned as
$H = \bigcup_{f \in FS(H)} G_H(f(x_1, \dots, x_{\alpha(f)}))$. In the following
theorem we use this partition of H to split a term tuple cover problem into
subproblems:

Theorem 4. (basic division into subproblems) Let H be some Herbrand
universe whose set of function symbols is denoted by FS(H) (constants are
considered as function symbols of arity 0). Furthermore, let
$M = \{\mathbf{t}_1, \dots, \mathbf{t}_n\}$ be a set of term k-tuples over H and let
$p \in \{1, \dots, k\}$ denote a component of the tuples. For every $f \in FS(H)$
with arity $\alpha(f)$, we define the "subproblem" $M_f^{(p)}$ as follows:

$M_f^{(p)} = \{(t_{i1}, \dots, t_{i(p-1)}, s_{i1}, \dots, s_{i\alpha(f)}, t_{i(p+1)}, \dots, t_{ik}) \mid t_{ip} = f(s_{i1}, \dots, s_{i\alpha(f)})\}$

$\cup\ \{(t_{i1}, \dots, t_{i(p-1)}, x_1, \dots, x_{\alpha(f)}, t_{i(p+1)}, \dots, t_{ik})\sigma \mid t_{ip}$ is a variable, the $x_j$
are new pairwise distinct variables and $\sigma = \{t_{ip} \leftarrow f(x_1, \dots, x_{\alpha(f)})\}\}$

Then M covers $H^k$, iff $M_f^{(p)}$ covers $H^{k+\alpha(f)-1}$ for every $f \in FS(H)$.

Proof: (sketch) Let $p \in \{1, \dots, k\}$ be an arbitrary component of the
k-tuples. Then the above mentioned partition of H via the possible leading
symbols of the terms in H can be generalized to the following partition of $H^k$:

$H^k = \bigcup_{f \in FS(H)} H^{p-1} \times G_H(f(x_1, \dots, x_{\alpha(f)})) \times H^{k-p}$

Note that the tuples from M whose p-th component is a functional term with
leading symbol $g \neq f$ play no role in covering
$H^{p-1} \times G_H(f(x_1, \dots, x_{\alpha(f)})) \times H^{k-p}$. In order to test whether
some term tuple set M covers
$H^{p-1} \times G_H(f(x_1, \dots, x_{\alpha(f)})) \times H^{k-p}$, only the term tuples with a
variable in the p-th component or with a functional term with leading symbol f
have to be considered. Moreover, from the tuples with a variable z in the p-th
component, only the instance where "z" is instantiated to the term
$f(x_1, \dots, x_{\alpha(f)})$ for some new, pairwise distinct variables $x_i$ is needed.

By restricting the term tuple set M in this way, we get a set M', where all
tuples have a functional term with leading symbol f in the p-th component, i.e.:

$M' = \{\mathbf{t}_1, \dots, \mathbf{t}_{n'}\}$ with $\mathbf{t}_i = (t_{i1}, \dots, t_{i(p-1)}, f(s_{i1}, \dots, s_{i\alpha(f)}), t_{i(p+1)}, \dots, t_{ik})$.

Then M' covers $H^{p-1} \times G_H(f(x_1, \dots, x_{\alpha(f)})) \times H^{k-p}$, iff
$M'' = \{(t_{i1}, \dots, t_{i(p-1)}, s_{i1}, \dots, s_{i\alpha(f)}, t_{i(p+1)}, \dots, t_{ik}) \mid \mathbf{t}_i \in M'\}$
covers $H^{k+\alpha(f)-1}$. But the latter condition corresponds to the term tuple
subproblem $M_f^{(p)}$. ◇

In the following example, we put this theorem to work: 

Example 5. (basic division into subproblems) Let H be the Herbrand universe
with signature $FS(H) = \{f, g, a\}$. Furthermore, let an instance of the term
tuple cover problem be given through the set
$M = \{(f(x), y), (g(x), y), (x, x), (x, f(y)), (x, g(y))\}$. Then the subproblems
$M_a^{(1)} = \{(a), (f(y)), (g(y))\}$,
$M_f^{(1)} = \{(x, y), (x, f(x)), (x, f(y)), (x, g(y))\}$ and
$M_g^{(1)} = \{(x, y), (x, g(x)), (x, f(y)), (x, g(y))\}$
correspond to the requirement that M covers the subsets $G_H(a) \times H$,
$G_H(f(x)) \times H$ and $G_H(g(x)) \times H$, respectively, of $H^2$.
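
The construction of the subproblems in theorem 4 can be sketched in a few lines
and checked against example 5. The sketch assumes that variables are encoded as
strings and function terms as tuples (symbol, arg1, ..., argn), that components
are counted from 0 rather than from 1, and that new variables are named _v1,
_v2, ...; the function names are hypothetical.

```python
from itertools import count

_fresh = count(1)          # source of new variable names (an implementation choice)


def substitute(term, var, image):
    """Replace every occurrence of the variable var in term by image."""
    if isinstance(term, str):
        return image if term == var else term
    return (term[0],) + tuple(substitute(arg, var, image) for arg in term[1:])


def subproblem(tuples, p, symbol, arity):
    """Subproblem of theorem 4 for component p (0-based) and leading symbol."""
    result = []
    for tup in tuples:
        head = tup[p]
        if isinstance(head, str):
            # Variable in component p: instantiate it to symbol(x_1, ..., x_arity)
            # with new variables and put the x_j in place of the component.
            fresh = tuple(f"_v{next(_fresh)}" for _ in range(arity))
            tup = tuple(substitute(t, head, (symbol,) + fresh) for t in tup)
            result.append(tup[:p] + fresh + tup[p + 1:])
        elif head[0] == symbol:
            # Matching leading symbol: replace the term by its arguments.
            result.append(tup[:p] + tuple(head[1:]) + tup[p + 1:])
        # Tuples with another leading symbol play no role and are dropped.
    return result


# The term tuple set M of example 5.
M = [(("f", "x"), "y"), (("g", "x"), "y"), ("x", "x"),
     ("x", ("f", "y")), ("x", ("g", "y"))]
M_a = subproblem(M, 0, "a", 0)
M_f = subproblem(M, 0, "f", 1)
M_g = subproblem(M, 0, "g", 1)
```

Up to the names of the new variables, the three results coincide with the
subproblems listed in example 5, and each of them has fewer tuples than M.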

In the above example, the division into subproblems was not problematic at all:
Note that all of the 3 subproblems $M_a^{(1)}$, $M_f^{(1)}$ and $M_g^{(1)}$ have
strictly fewer term tuples than the original problem M. The reason why things
ran so smoothly in example 5 is that two different function symbols (namely f
and g) occurred as leading symbols of the terms in the first component. However,
there is no guarantee that two different function symbols actually do occur in
some component. It is, therefore, the purpose of this chapter to provide
appropriate solutions for the cases where only one function symbol or none occurs.

Both for the division into subproblems and for the redundancy criteria, we 
have to distinguish two cases, namely Herbrand universes with 2 or more function 
symbols of non-zero arity and Herbrand universes with only 1 such function 
symbol. Surprisingly enough, the latter case turns out to be much more difficult 
to handle than the former one. Moreover, we get a much better upper bound 
on the time complexity in the former case: By theorem 10, the term tuple cover 
problem over some Herbrand universe with 2 or more function symbols can be 
solved in time exponential in the number of term tuples while the upper bound 
obtained in theorem 12 for the latter case corresponds to the factorial of the 
number of term tuples. 



4.1 Two or More Function Symbols 

In this section we shall prove two redundancy criteria, which hold for any infinite 
Herbrand universe. Nevertheless, only in case of a Herbrand universe with 2 or 
more function symbols of non-zero arity, they are strong enough to allow the 
construction of an efficient algorithm. In section 4.2, we shall see that they do 
not suffice in case of only 1 such function symbol. 

It has already been mentioned that the splitting into strictly smaller
subproblems according to theorem 4 requires that 2 different function symbols
occur as leading symbols in the p-th component. Now suppose that all term tuples
of M have a variable as p-th component. Then all tuples would have to be
considered in each subproblem. Actually, we do not worry about the case where
the variables in the p-th component occur nowhere else in their tuple, since
then we are on the right track towards the solved form from definition 3.
However, if some variable $x_{ip}$ occurs more than once in the i-th tuple, then
some action has to be taken. The following theorem shows that we can handle this
situation by deleting all tuples with a multiply occurring variable in the p-th
component:

Theorem 6. (redundancy criterion based on variable components) Let
H be an arbitrary, infinite Herbrand universe and let
$M = \{\mathbf{t}_1, \dots, \mathbf{t}_n\}$ be a set of term k-tuples over H. Suppose that all
terms occurring in the p-th component of the tuples from M are variables. Then
every term tuple $\mathbf{t}_i \in M$ whose variable from the p-th component occurs
somewhere else in $\mathbf{t}_i$ is redundant and may, therefore, be deleted, i.e.:

Let $M^{(p)} := \{\mathbf{t}_i = (t_{i1}, \dots, t_{i(p-1)}, x, t_{i(p+1)}, \dots, t_{ik}) \in M \mid$ s.t. x is a
variable that occurs somewhere else in $\mathbf{t}_i\}$. Then M covers all of $H^k$, iff
$M - M^{(p)}$ does.

Proof: We assume w.l.o.g. that p = 1 (since otherwise we would swap the p-th
component with the first one). Suppose that M covers all of $H^k$. Furthermore,
let $\mathbf{t} = (t_1, \dots, t_k)$ be an arbitrary ground term tuple which is covered by
some $\mathbf{t}_i \in M^{(p)}$. We have to prove that $\mathbf{t}$ is also covered by some
$\mathbf{t}_j \in (M - M^{(p)})$: Let d denote the term depth of $\mathbf{t}$, i.e.:
$d = \max(\{\tau(t_\gamma) \mid 1 \le \gamma \le k\})$ and choose an arbitrary term
$s \in H$ with $\tau(s) > d$. Since we only consider the case of an infinite
Herbrand universe H here, such a ground term $s \in H$ actually does exist. Then
the term tuple $\mathbf{s} = (s, t_2, \dots, t_k)$ is also contained in $H^k$ and,
therefore, covered by some $\mathbf{t}_j = (x, t_{j2}, \dots, t_{jk}) \in M$, i.e.:
$\mathbf{s} = \mathbf{t}_j\sigma$ with $\sigma = \{x \leftarrow s\} \circ \eta$ for some H-ground
substitution $\eta$.

We claim that $\mathbf{t}_j \in (M - M^{(p)})$. For suppose, on the contrary, that
$\mathbf{t}_j \in M^{(p)}$. Then x also occurs in some component $t_{jq}$ of $\mathbf{t}_j$. Hence
$\tau(t_{jq}\sigma) \ge \tau(x\sigma) = \tau(s) > d$. But this contradicts the assumption that
$\tau(t_q) \le d$ for all $q \ge 2$ and $\mathbf{t}_j\sigma = (s, t_2, \dots, t_k)$.

Hence $\mathbf{t}_j \in (M - M^{(p)})$, i.e.: x occurs nowhere else in $\mathbf{t}_j$. But then we
can substitute another term for the variable x without changing the remaining
components $t_{jq}\sigma$ with $q \ge 2$. Hence, $\mathbf{t} = (t_1, \dots, t_k) = \mathbf{t}_j\sigma'$
with $\sigma' = \{x \leftarrow t_1\} \circ \eta$. Therefore, $\mathbf{t}$ is also covered by
$\mathbf{t}_j \in (M - M^{(p)})$. ◇
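
The deletion step licensed by theorem 6 can be sketched as follows. The sketch
assumes the encoding variable = string, function term = tuple (symbol, arg1,
..., argn), counts components from 0, and uses the hypothetical function name
apply_rule_v. It is only meaningful under the assumptions of the theorem, i.e.
an infinite Herbrand universe and variables only in the inspected component.

```python
def variables(term):
    """Set of variables of a term (variable = str, function term = tuple)."""
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(arg) for arg in term[1:]))


def apply_rule_v(tuples, p):
    """Delete the tuples that are redundant on component p (0-based) in the
    sense of theorem 6. Only valid if every p-th component is a variable
    and the Herbrand universe is infinite."""
    if not all(isinstance(tup[p], str) for tup in tuples):
        raise ValueError("component p must contain variables only")
    return [tup for tup in tuples
            if not any(tup[p] in variables(t)
                       for q, t in enumerate(tup) if q != p)]


# Hypothetical input {(x, x), (x, f(y)), (y, g(y))}, inspected on the first
# component: the first and the third tuple are redundant.
M = [("x", "x"), ("x", ("f", "y")), ("y", ("g", "y"))]
reduced = apply_rule_v(M, 0)
```

Only the tuple (x, f(y)) survives, since its first variable occurs nowhere else
in the tuple.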

If only one function symbol $f \in FS(H)$ occurs as leading symbol of the p-th
component then the subproblem $M_f^{(p)}$, which corresponds to the condition that
M covers $H^{p-1} \times G_H(f(x_1, \dots, x_{\alpha(f)})) \times H^{k-p}$, has the same number
of term tuples as the original problem M. The following redundancy criterion
shows that these difficulties can be resolved by deleting all tuples with a
non-variable term in the p-th component:

Theorem 7. (redundancy criterion based on non-variable components)
Let H be an arbitrary, infinite Herbrand universe and let $f \in FS(H)$ be a
function symbol of non-zero arity. Furthermore, let
$M = \{\mathbf{t}_1, \dots, \mathbf{t}_n\}$ be a set of term k-tuples over H. Suppose that there
exist term tuples in M with a non-variable term in the p-th component but the
function symbol f does not occur as leading symbol of any of these terms. Then
every term tuple $\mathbf{t} \in M$ with a non-variable term in the p-th component is
redundant and may, therefore, be deleted, i.e.: Let
$M^{(p)} := \{\mathbf{t}_i = (t_{i1}, \dots, t_{ik}) \in M \mid t_{ip}$ is a non-variable term$\}$.
Then M covers all of $H^k$, iff $M - M^{(p)}$ does.
Proof: Again we assume w.l.o.g. that p = 1. Suppose that M covers all of $H^k$.
Furthermore, let $\mathbf{t} = (t_1, \dots, t_k)$ be an arbitrary ground term tuple which
is covered by some $\mathbf{t}_i \in M^{(p)}$. We have to prove that $\mathbf{t}$ is also covered
by some $\mathbf{t}_j \in (M - M^{(p)})$:

$\mathbf{t}_i \in M^{(p)}$. Hence, by the definition of $M^{(p)}$, $t_{i1}$ is a non-variable
term. Therefore, $t_1$ is a term with leading symbol g for some $g \in FS(H)$ s.t.
$g \neq f$. Let $d = \max(\{\tau(t_\gamma) \mid 1 \le \gamma \le k\})$ denote the term depth of
$\mathbf{t}$ and choose an arbitrary term $s \in H$ with $\tau(s) > d$. Then the term tuple
$\mathbf{s} = (f(s, \dots, s), t_2, \dots, t_k)$ is also contained in $H^k$ and,
therefore, covered by some $\mathbf{t}_j = (t_{j1}, \dots, t_{jk}) \in M$. But $\mathbf{s}$ has a
term with leading symbol f in the first component and, therefore, $\mathbf{s}$ cannot be
covered by a term tuple from $M^{(p)}$. Hence, $\mathbf{t}_j \in (M - M^{(p)})$. But then,
$t_{j1}$ is a variable x and, consequently, $\mathbf{s} = \mathbf{t}_j\sigma$ with
$\sigma = \{x \leftarrow f(s, \dots, s)\} \circ \eta$ for some H-ground substitution $\eta$.

Analogously to the proof of theorem 6, we can show that the variable x
occurs nowhere else in $\mathbf{t}_j$: For suppose, on the contrary, that x also occurs
in $t_{jq}$ for some $q \ge 2$. Then
$\tau(t_{jq}\sigma) \ge \tau(x\sigma) = \tau(f(s, \dots, s)) > d$. But this contradicts the
assumption that $\tau(t_q) \le d$ for all $q \ge 2$ and
$\mathbf{t}_j\sigma = (f(s, \dots, s), t_2, \dots, t_k)$. But then, (again like in the proof
of theorem 6) we can substitute another term for the variable x without changing
the remaining components $t_{jq}\sigma$ with $q \ge 2$. Hence, $\mathbf{t} = \mathbf{t}_j\sigma'$ with
$\sigma' = \{x \leftarrow t_1\} \circ \eta$. Therefore, $\mathbf{t}$ is also covered by
$\mathbf{t}_j \in (M - M^{(p)})$. ◇

Remark: The redundancy criteria from the theorems 6 and 7 allow the deletion
of a subset $M^{(p)}$ of M by inspecting the p-th component of every tuple. Hence,
for either criterion, we shall refer to the elements of $M^{(p)}$ as the tuples
redundant on p. Note that (for a fixed Herbrand universe H) these redundancy
criteria can be easily tested in polynomial time (the latter one can be tested
in quadratic time while, for the former one, linear time is sufficient).
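
The criterion of theorem 7 can be sketched along the same lines. As before, the
encoding (variable = string, function term = tuple), the 0-based component index
and the name apply_rule_nv are assumptions of this sketch; the list of function
symbols of non-zero arity has to be supplied by the caller.

```python
def apply_rule_nv(tuples, p, nonconstant_symbols):
    """Apply the criterion of theorem 7 to component p (0-based).

    nonconstant_symbols lists the function symbols of non-zero arity of H.
    If some tuple has a non-variable p-th component and one of these symbols
    never occurs as leading symbol there, all tuples with a non-variable
    p-th component are deleted; otherwise the set is returned unchanged."""
    leading = {tup[p][0] for tup in tuples if not isinstance(tup[p], str)}
    missing = [f for f in nonconstant_symbols if f not in leading]
    if not leading or not missing:
        return list(tuples)           # criterion not applicable
    return [tup for tup in tuples if isinstance(tup[p], str)]


# Hypothetical input {(f(x), y), (a, y), (z, f(y))} over a universe whose
# function symbols of non-zero arity are f and g.
M = [(("f", "x"), "y"), (("a",), "y"), ("z", ("f", "y"))]
reduced_first = apply_rule_nv(M, 0, ["f", "g"])
reduced_second = apply_rule_nv(M, 1, ["f", "g"])
```

In the first component g is missing as leading symbol, so only (z, f(y)) is
kept; in the second component the tuple (z, f(y)) itself is deleted.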

We are now ready to construct an algorithm for solving the term tuple cover
problem for an arbitrary Herbrand universe with at least 2 function symbols
of non-zero arity. In analogy with [CL 89], we shall distinguish two different
kinds of transformation rules, namely: rules which transform a term tuple set
into another term tuple set and rules which transform a term tuple set into a
collection of term tuple sets.

Definition 8. (transformation rules) We define a rule system consisting of
the following three rules:

1. V (= redundancy based on variable components): $M \rightarrow_V M - M^{(p)}$,
where p is a component in M which contains only variables and
$M^{(p)} = \{\mathbf{t}_i = (t_{i1}, \dots, t_{i(p-1)}, x, t_{i(p+1)}, \dots, t_{ik}) \in M \mid$ s.t. x is a
variable that occurs somewhere else in $\mathbf{t}_i\}$ denotes the term tuples which
are redundant on p by the redundancy criterion of theorem 6. The rule "V" may
only be applied, if $M^{(p)} \neq \emptyset$.

2. NV (= redundancy based on non-variable components):
$M \rightarrow_{NV} M - M^{(p)}$, where p is a non-variable component in M s.t. there
exists a function symbol $f \in FS(H)$ of non-zero arity which does not occur as
leading symbol of the p-th component of any tuple in M. Furthermore,
$M^{(p)} = \{\mathbf{t}_i = (t_{i1}, \dots, t_{ik}) \in M \mid t_{ip}$ is a non-variable term$\}$
denotes the term tuples which are redundant on p by the redundancy criterion of
theorem 7. The rule "NV" may only be applied, if $M^{(p)} \neq \emptyset$.

3. S (= splitting into subproblems): M is replaced by the collection of
subproblems $M_f^{(p)}$, one for every $f \in FS(H)$, provided that there are at
least 2 different function symbols occurring as leading symbols in the p-th
component. Here f ranges over the function symbols in H (constants are
considered as function symbols of arity 0) and $M_f^{(p)}$ denotes the subproblem
from theorem 4, which corresponds to the requirement that M covers
$H^{p-1} \times G_H(f(x_1, \dots, x_{\alpha(f)})) \times H^{k-p}$.

In the following theorem we prove that the above rule system has the desired 
property of transforming an arbitrary term tuple problem into the solved form 
from definition 3. An upper bound on the (time) complexity of this transforma- 
tion will then be provided in theorem 10. 

Theorem 9. Let H be a Herbrand universe with at least 2 function symbols
of non-zero arity. Then the non-deterministic application of the rule system
from definition 8 to an arbitrary instance of the term tuple cover problem
terminates with an equivalent collection of term tuple sets in solved form.

Proof: For the correctness of the rule system we have to show, on the one 
hand, that every single rule is correct and, on the other hand, that the resulting 
term tuple sets are all in solved form when no more rules are applicable. The 
correctness of every single rule has already been proven: the rules "V" and 
"NV" transform a term tuple set into an equivalent set by theorems 6 and 
7. Likewise, by theorem 4, the rule "S" replaces the original term tuple set by 
an equivalent collection of term tuple sets. 

Now suppose that $\mathcal{M}$ is a collection of term tuple sets s.t. no more rule 
application is possible. If $M \in \mathcal{M}$ is empty then, by definition, M is solved. So 
suppose that $M \in \mathcal{M}$ is non-empty. We have to show that then M contains a 
term tuple t which consists of pairwise distinct variables only. To this end we 
exclude the following two cases: If no function symbol at all occurs in the tuples 
of M but none of the tuples consists of pairwise distinct variables, then there 
exists a tuple $t \in M$ with the same variable occurring in 2 different components 
p and q. Then the rule "V" is applicable to the p-th component, since there are 
only variables in the p-th component of the remaining tuples and, therefore, t is 
redundant on p by theorem 6. Likewise, if there does exist a function symbol in 
M, then M contains a tuple t with a functional term $t_p$ in the p-th component. 
Hence, either the rule "NV" (if one function symbol of non-zero arity is missing 
as leading symbol in the p-th component) or the rule "S" is applicable (if at least 
2 different function symbols occur as leading symbols in the p-th component). 
But this again contradicts the assumption that no rule is applicable to M. Note 
that this is the only place in the proof where we actually make use of the fact that 
H contains at least 2 distinct function symbols of non-zero arity, for otherwise 
it may happen that neither the rule "NV" nor the rule "S" is applicable to a 
component where some function symbol occurs. 
The termination of non-deterministic applications of the above rules can be 
shown by a multiset argument based on the following observation: Whenever a 
rule "V", "NV" or "S" is applied, the original term tuple set is replaced by 
a finite number c of term tuple sets with strictly fewer term tuples. (In case of 
the rules "V" and "NV", c = 1, whereas c = |FS(H)| in case of "S".) □ 

Theorem 10. (complexity estimation) Let H be a Herbrand universe with 
at least 2 function symbols of non-zero arity. Then the term tuple cover problem 
can be decided in time $O(c^n * pol(N))$, where c = |FS(H)|, n denotes the number 
of term tuples and pol(N) is some polynomial function in the total length N of 
an input problem instance. 

Proof: (sketch) In order to analyse the complexity of the rule system from 
definition 8, we have a closer look at the multiset argument in the termination 
proof of theorem 9: Whenever one of the transformation rules is applied to some 
term tuple set M, then this set M branches into at most c sets each containing 
strictly fewer term tuples. Hence, the whole transformation process corresponds 
to a tree whose depth is restricted by the number n of term tuples in the original 
set and where the degree of the nodes is restricted by c. But then the number 
of rule applications (which corresponds to the number of internal nodes of the 
tree) is restricted by $c^n$. □ 
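The counting step in this proof sketch can be spelled out as follows. This is only a sketch under the assumptions of the termination argument: a rule is applied only to a set with at least one term tuple, and every successor set has strictly fewer tuples. A node at depth d then carries at most n − d tuples, so internal nodes occur only at depths 0, ..., n − 1, and with branching degree at most c ≥ 2 one obtains:

```latex
\#\{\text{rule applications}\}
  \;\le\; \sum_{d=0}^{n-1} c^{d}
  \;=\; \frac{c^{n}-1}{c-1}
  \;<\; c^{n}
  \qquad (c \ge 2).
```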



4.2 One Function Symbol 

As we have seen in the previous section, redundancy criteria are necessary to 
ensure that the original term tuple cover problem can be split into strictly smaller 
subproblems. However, the basic idea of partitioning the Herbrand universe could 
be carried over from previously known algorithms to ours without modifications. 
The following example illustrates that, in case of a Herbrand universe H with 
only 1 function symbol of non-zero arity, we even have to modify the way in 
which H is partitioned. 

Example 11. (unsuitable partition of H) Let $M = \{(f^2(x), y), (x, f^2(y)), (x, x), (f(x), x), (x, f(x))\}$ 
be an instance of the term tuple cover problem over the 
Herbrand universe H with signature $FS(H) = \{f, a\}$. 

Then M covers $H^2$, iff $M_a = \{(a, f^2(y)), (a, a), (a, f(a))\}$ covers $G_H(a) \times H$ 
and $M_f = \{(f^2(x), y), (f(x), f^2(y)), (f(x), f(x)), (f(x), x), (f(x), f^2(x))\}$ covers 
$G_H(f(x)) \times H$. 

The subproblem $M_f = \{(f^2(x), y), (f(x), f^2(y)), (f(x), f(x)), (f(x), x), (f(x), f^2(x))\}$ 
from example 11 contains an instance of every tuple of M, where $M_f$ corresponds to 
the requirement that M covers $G_H(f(x)) \times H$ (cf. theorem 4). In section 4.1, it 
was possible to resolve this kind of situation by means of redundancy criteria. 
Note, however, that in the above situation we cannot hope to delete one of the 
tuples of M via a new redundancy criterion, since all of the 5 tuples of M are 
actually necessary to cover $H^2$. Hence, in contrast to the previous section, we 
even need a different partition of H in order to guarantee that all subproblems 
have strictly fewer term tuples. 
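The claim that all 5 tuples are needed can be checked mechanically on a bounded fragment of the universe. The following sketch assumes the reading M = {(f²(x), y), (x, f²(y)), (x, x), (f(x), x), (x, f(x))} of example 11 and encodes the ground term fⁿ(a) as the natural number n; the encoding and the names `match` and `uncovered` are illustrative assumptions, not part of the paper.

```python
from itertools import product

# A term over FS(H) = {f, a} is f^k applied to a variable or to the constant a.
# It is encoded as (k, base): base is a variable name, or None for the constant a.
# The ground term f^n(a) is encoded as the natural number n.

def match(pattern, ground):
    """Return True iff the ground tuple is an instance of the pattern tuple."""
    binding = {}
    for (k, base), n in zip(pattern, ground):
        if base is None:
            if n != k:          # f^k(a) matches only the ground term f^k(a)
                return False
        else:
            if n < k:           # f^k(x) needs at least k leading f's
                return False
            # the same variable must be bound to the same ground term everywhere
            if binding.setdefault(base, n - k) != n - k:
                return False
    return True

def uncovered(tuples, bound):
    """Ground pairs below the bound that are an instance of none of the tuples."""
    return [g for g in product(range(bound), repeat=2)
            if not any(match(t, g) for t in tuples)]

# Assumed reading of example 11.
M = [
    ((2, "x"), (0, "y")),   # (f^2(x), y)
    ((0, "x"), (2, "y")),   # (x, f^2(y))
    ((0, "x"), (0, "x")),   # (x, x)
    ((1, "x"), (0, "x")),   # (f(x), x)
    ((0, "x"), (1, "x")),   # (x, f(x))
]

if __name__ == "__main__":
    print(uncovered(M, 8))                        # no uncovered pair below the bound
    for i in range(len(M)):
        print(i, uncovered(M[:i] + M[i + 1:], 8)[:1])   # a witness for each deletion
```

A bounded check of this kind is not a proof that M covers H², but it shows that deleting any single tuple already leaves a small ground pair uncovered.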

Due to space limitations, we can only give the complexity result obtained for 
the case of a Herbrand universe with a single function symbol of non-zero arity. 
In [Pic 98], an extended version of this paper is available, where the algorithm 
and the required lemmas are worked out in detail. 

Theorem 12. (complexity estimation) Let H be a Herbrand universe with 
only 1 function symbol of non-zero arity and let n denote the number of term 
tuples. Then the term tuple cover problem can be decided in time $O(n! * pol(N))$, 
where pol(N) is some polynomial function in the total length N of an input 
problem instance. 

5 The Clause Evaluation Problem 

5.1 Transformation of the Clause Evaluation Problem 

Analogously to the transformation of the model equivalence problem in chapter 
3, we shall transform the clause evaluation problem to an equivalent term tuple 
problem. However, the term tuple cover problem is not appropriate in this case, 
due to the existence of negative literals. Hence, the clause evaluation problem 
will be transformed into another type of problem which we shall call the term 
tuple inclusion problem, i.e.: Given term tuple sets 
$E^{(1)} = \{e^{(1)}_1, \dots, e^{(1)}_{n_1}\}, \dots, E^{(m)} = \{e^{(m)}_1, \dots, e^{(m)}_{n_m}\}$ 
and $M = \{t_1, \dots, t_n\}$. Is every common H-ground 
instance s of $E^{(1)}, \dots, E^{(m)}$ also an instance of M? (Note that we call a ground 
term tuple v an instance of a set of tuples $W = \{w_1, \dots, w_n\}$, iff there exists 
some $w_i \in W$, s.t. v is an instance of $w_i$.) 

The following transformation based on mgu’s of atoms in a clause C and the 
atoms Ai of an atom representation A is a generalization of theorem 2: 

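Theorem 13. (transformation of the clause evaluation problem) Let 
$\mathcal{A} = \{A_1, \dots, A_n\}$ be an AR of the model $\mathcal{M}_{\mathcal{A}}$ over some Herbrand universe 
H and let $C = L_1 \vee \dots \vee L_l \vee \neg M_1 \vee \dots \vee \neg M_m$ be a clause over H. Let 
$V(C) = \{x_1, \dots, x_k\}$ denote the variables occurring in C and suppose that 
$V(C) \cap V(A_i) = \emptyset$ for all $i \in \{1, \dots, n\}$ (i.e.: C and the $A_i$'s have no variables 
in common). Furthermore, let $\Phi_j$ be defined as the set of unifiers of $M_j$ 
with some $A_i$ and let $\Psi$ be the set of unifiers of pairs $(A_i, L_j)$, i.e.: 

$\Phi_j = \{\varphi \mid (\exists i)$ s.t. $A_i$ and $M_j$ are unifiable with $\varphi' = mgu(A_i, M_j)$ and $\varphi = \varphi'|_{V(C)}\}$ (with $1 \le j \le m$) 

$\Psi = \{\psi \mid (\exists i)(\exists j)$ s.t. $A_i$ and $L_j$ are unifiable with $\psi' = mgu(A_i, L_j)$ and $\psi = \psi'|_{V(C)}\}$ 

Then C evaluates to T in $\mathcal{M}_{\mathcal{A}}$, iff 
$\bigl(\bigcup_{\varphi \in \Phi_1} G_H(x_1\varphi, \dots, x_k\varphi)\bigr) \cap \dots \cap \bigl(\bigcup_{\varphi \in \Phi_m} G_H(x_1\varphi, \dots, x_k\varphi)\bigr) \subseteq \bigcup_{\psi \in \Psi} G_H(x_1\psi, \dots, x_k\psi)$ 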

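Proof (sketch): Analogously to the proof of theorem 2, the evaluation of ground 
atoms in $\mathcal{M}_{\mathcal{A}}$ can be expressed through appropriate conditions on term tuples 
and their H-ground instances, i.e.: Let $\sigma$ be an H-ground substitution with 
domain V(C) and let $(t_1, \dots, t_k) = (x_1\sigma, \dots, x_k\sigma)$ be a ground term tuple in 
$H^k$. Then the following relations hold: 

1. For every j with $1 \le j \le m$: $(t_1, \dots, t_k) \in \bigcup_{\varphi \in \Phi_j} G_H(x_1\varphi, \dots, x_k\varphi)$ $\Leftrightarrow$ $M_j\sigma$ 
evaluates to T in $\mathcal{M}_{\mathcal{A}}$. 

2. Hence, $(t_1, \dots, t_k) \in \bigl(\bigcup_{\varphi \in \Phi_1} G_H(x_1\varphi, \dots, x_k\varphi)\bigr) \cap \dots \cap \bigl(\bigcup_{\varphi \in \Phi_m} G_H(x_1\varphi, \dots, x_k\varphi)\bigr)$ 
$\Leftrightarrow$ $M_1\sigma \wedge \dots \wedge M_m\sigma$ evaluates to T in $\mathcal{M}_{\mathcal{A}}$ $\Leftrightarrow$ $\neg M_1\sigma \vee \dots \vee \neg M_m\sigma$ 
evaluates to F in $\mathcal{M}_{\mathcal{A}}$. 

3. $(t_1, \dots, t_k) \in \bigcup_{\psi \in \Psi} G_H(x_1\psi, \dots, x_k\psi)$ $\Leftrightarrow$ there is some j with $1 \le j \le l$, s.t. 
$L_j\sigma$ evaluates to T in $\mathcal{M}_{\mathcal{A}}$ $\Leftrightarrow$ $L_1\sigma \vee \dots \vee L_l\sigma$ evaluates to T in $\mathcal{M}_{\mathcal{A}}$. 

The transformation into the term tuple inclusion problem is based on the following 
idea: C evaluates to T $\Leftrightarrow$ all H-ground instances of C evaluate to T 
$\Leftrightarrow$ for every H-ground substitution $\sigma$: if all negative literals $\neg M_j\sigma$ of $C\sigma$ evaluate to 
F, then some positive literal $L_j\sigma$ of $C\sigma$ evaluates to T $\Leftrightarrow$ for every H-ground 
term tuple $t = (t_1, \dots, t_k) \in H^k$: If $t \in \bigl(\bigcup_{\varphi \in \Phi_1} G_H(x_1\varphi, \dots, x_k\varphi)\bigr) \cap \dots \cap \bigl(\bigcup_{\varphi \in \Phi_m} G_H(x_1\varphi, \dots, x_k\varphi)\bigr)$, 
then $t \in \bigcup_{\psi \in \Psi} G_H(x_1\psi, \dots, x_k\psi)$. □ 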

In the special case where C contains no negative literals, the empty intersection 
is tested for inclusion, i.e.: $H^k \subseteq \bigcup_{\psi \in \Psi} G_H(x_1\psi, \dots, x_k\psi)$, which is 
equivalent to the term tuple cover problem $\bigcup_{\psi \in \Psi} G_H(x_1\psi, \dots, x_k\psi) = H^k$. 

Remark: As far as complexity is concerned, the original clause evaluation prob- 
lem and the resulting term tuple inclusion problem are basically the same: Both 
problems are coNP-hard, the number of term tuples is restricted by the square 
of the number of atoms, the total length of each term tuple inclusion problem is 
restricted by the square of the length of the original clause evaluation problem 
(provided that an efficient unification algorithm is used, which represents terms 
as directed acyclic graphs rather than as strings of symbols) , etc. 



5.2 The Term Tuple Inclusion Problem 

In chapter 4, we have proven that the complexity of the term tuple cover prob- 
lem depends primarily on the number of term tuples. Consequently, the number 
of atoms was identified as the main source of complexity for the model equiv- 
alence problem. In order to derive a similar result for the term tuple inclusion 
problem and the clause evaluation problem, we transform a given term tuple 
inclusion problem into an equivalent collection of term tuple cover problems in 
the following way: 



1. Distributivity of $\cap$ and $\cup$: In the term tuple inclusion problem for the sets 
$E^{(1)} = \{e^{(1)}_1, \dots, e^{(1)}_{n_1}\}, \dots, E^{(m)} = \{e^{(m)}_1, \dots, e^{(m)}_{n_m}\}$, $M = \{t_1, \dots, t_n\}$, 
the following kind of set inclusion has to be tested: 

$(A_{11} \cup \dots \cup A_{1n_1}) \cap \dots \cap (A_{m1} \cup \dots \cup A_{mn_m}) \subseteq B$, 

where $A_{ij} = G_H(e^{(i)}_j)$ and $B = \bigcup_{i=1}^{n} G_H(t_i)$. Then this intersection of 
unions can be transformed into an equivalent union of intersections by the 
distributivity of $\cap$ and $\cup$, i.e.: 

$(A_{11} \cup \dots \cup A_{1n_1}) \cap \dots \cap (A_{m1} \cup \dots \cup A_{mn_m}) = \bigcup_{\alpha_1=1}^{n_1} \dots \bigcup_{\alpha_m=1}^{n_m} (A_{1\alpha_1} \cap \dots \cap A_{m\alpha_m})$ 

But a union of sets $C_i$ is contained in another set B, iff every set $C_i$ is. 
Hence, the original inclusion problem is equivalent to the following collection 
of simple inclusion problems: 

$(\forall \alpha_1 \in \{1, \dots, n_1\}) \dots (\forall \alpha_m \in \{1, \dots, n_m\}): A_{1\alpha_1} \cap \dots \cap A_{m\alpha_m} \subseteq B$ 

2. Intersection of sets $G_H(e_i)$: Let $G_H(e_1) \cap \dots \cap G_H(e_m)$ with $e_i \in E^{(i)}$ denote 
one of the intersections which result from the previous transformation step. 
Then this intersection can be represented as the set of ground instances of 
a single term tuple in the following way: We first rename the variables in 
the tuples $e_i$, s.t. these tuples are pairwise variable disjoint. Then the set of 
common ground instances of these tuples can be computed via unification: 

(a) If $e_1, \dots, e_m$ are not unifiable, then $G_H(e_1) \cap \dots \cap G_H(e_m) = \emptyset$. 

(b) If $e_1, \dots, e_m$ are unifiable with mgu $\eta$, then $G_H(e_1) \cap \dots \cap G_H(e_m) = G_H(e_1\eta)$. 

But then every term tuple inclusion problem of the form 

$G_H(e_1) \cap \dots \cap G_H(e_m) \subseteq \bigcup_{i=1}^{n} G_H(t_i)$ 

can be either deleted (if $e_1, \dots, e_m$ are not unifiable) or transformed into an 
inclusion problem of the form $G_H(e_1\eta) \subseteq \bigcup_{i=1}^{n} G_H(t_i)$. 

3. H-subsumption: Let $s = e_1\eta$ denote one of the term tuples which are obtained 
by the previous step. Then the condition $G_H(s) \subseteq \bigcup_{i=1}^{n} G_H(t_i)$ corresponds 
to an H-subsumption criterion for term tuples, namely: $\{t_1, \dots, t_n\}$ H-subsumes s. 
Now suppose that s and the $t_i$'s have no variables in common. Furthermore, 
let $V(s) = \{x_1, \dots, x_l\}$ denote the variables occurring in s and let $\Theta$ be 
defined as the set of unifiers of pairs $(t_i, s)$, i.e.: 

$\Theta = \{\vartheta \mid (\exists i)$ s.t. $t_i$ and s are unifiable with $\vartheta' = mgu(t_i, s)$ and $\vartheta = \vartheta'|_{V(s)}\}$ 

Then, analogously to theorem 2, this H-subsumption criterion can be transformed 
into the term tuple cover problem $\bigcup_{\vartheta \in \Theta} G_H(x_1\vartheta, \dots, x_l\vartheta) = H^l$. 
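The first two transformation steps can be sketched in a few lines: enumerate one tuple per set (distributivity), rename the selected tuples apart, and unify them to obtain a single tuple representing the intersection. The term encoding, the textbook unification routine and all function names below are assumptions made for illustration; the paper does not prescribe an implementation, and it recommends a DAG-based unification algorithm for efficiency, which this sketch does not attempt.

```python
from itertools import product

# Hypothetical encoding: a variable is a string; a constant or compound term is
# a tuple (symbol, arg_1, ..., arg_k). A term tuple is a compound term whose
# symbol is the reserved name "tuple".

def walk(term, subst):
    """Follow variable bindings until an unbound variable or a non-variable term."""
    while isinstance(term, str) and term in subst:
        term = subst[term]
    return term

def occurs(var, term, subst):
    """Occurs check: does var occur in term under subst?"""
    term = walk(term, subst)
    if isinstance(term, str):
        return term == var
    return any(occurs(var, arg, subst) for arg in term[1:])

def unify(t1, t2, subst):
    """Return an extension of subst that unifies t1 and t2, or None."""
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if isinstance(t1, str):
        if t1 == t2:
            return subst
        return None if occurs(t1, t2, subst) else {**subst, t1: t2}
    if isinstance(t2, str):
        return unify(t2, t1, subst)
    if t1[0] != t2[0] or len(t1) != len(t2):
        return None
    for left, right in zip(t1[1:], t2[1:]):
        subst = unify(left, right, subst)
        if subst is None:
            return None
    return subst

def resolve(term, subst):
    """Apply subst exhaustively to term."""
    term = walk(term, subst)
    if isinstance(term, str):
        return term
    return (term[0],) + tuple(resolve(arg, subst) for arg in term[1:])

def rename(term, index):
    """Rename variables apart by appending an index to every variable name."""
    if isinstance(term, str):
        return f"{term}_{index}"
    return (term[0],) + tuple(rename(arg, index) for arg in term[1:])

def common_instance(tuples):
    """Step 2: a tuple whose ground instances are the common ground instances, or None."""
    renamed = [rename(t, i) for i, t in enumerate(tuples)]
    subst = {}
    for other in renamed[1:]:
        subst = unify(renamed[0], other, subst)
        if subst is None:
            return None
    return resolve(renamed[0], subst)

def selections_to_tuples(sets):
    """Step 1 and step 2: one tuple per unifiable selection of one element from each set."""
    candidates = (common_instance(selection) for selection in product(*sets))
    return [s for s in candidates if s is not None]

def tup(*components):
    return ("tuple",) + components

def f(term):
    return ("f", term)

a = ("a",)

if __name__ == "__main__":
    E1 = [tup("x", f("y")), tup(a, a)]
    E2 = [tup(f("z"), "z")]
    print(selections_to_tuples([E1, E2]))
```

Each tuple s returned here would still have to be tested against M = {t_1, ..., t_n} by the third step, i.e. by solving one term tuple cover problem per tuple.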

By combining the above transformation with the complexity results for the term 
tuple cover problem from chapter 4, we arrive at an upper bound on the com- 
plexity of our term tuple inclusion problem. Due to space limitations, we only 
consider the case of a Herbrand universe with 2 or more function symbols of 
non-zero arity below: 
Theorem 14. (complexity estimation) Let H be a Herbrand universe with 
at least 2 function symbols of non-zero arity. Furthermore, let V be an instance 
of the term tuple inclusion problem (over H), where c = |FS(H)|, n denotes 
the number of term tuples and pol(N) is some polynomial function in the total 
length N of V. Then the term tuple inclusion problem for V can be solved in 
time $O(c^n * pol(N))$. 

Proof: (sketch) If we solve the term tuple inclusion problem by first transforming 
it into an equivalent set of term tuple cover problems, then the overall 
complexity of this algorithm is mainly determined by the number of term 
tuple cover problems and by the cost of solving each term tuple cover problem. 
Let V, given by the sets $E^{(1)}, \dots, E^{(m)}$ and M, be an instance of the term tuple inclusion 
problem with $|M| = n'$ and $|E^{(1)}| + \dots + |E^{(m)}| = n''$. Then the number of 
term tuple cover problems obtained by the above transformation corresponds 
to $|E^{(1)}| * \dots * |E^{(m)}|$, which (for fixed n'') becomes maximal, if all sets $E^{(i)}$ 
consist of 3 elements. Hence, the number of term tuple cover problems is restricted 
by $3^{n''/3}$. On the other hand, the number of term tuples in a single term 
tuple cover problem is restricted by $|M| = n'$. Furthermore, the total size of 
every term tuple cover problem is linearly restricted by the total size of the 
original term tuple inclusion problem, provided that we use an efficient unification 
algorithm. Hence, together with theorem 10, we get the complexity bound 
$O(3^{n''/3} * c^{n'} * pol(N)) \le O(c^{n''} * c^{n'} * pol(N)) = O(c^n * pol(N))$ for the term tuple 
inclusion problem. □ 
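The counting claim used in this proof sketch, namely that for a fixed total n'' the product of the set sizes is largest when the sets have 3 elements each and hence never exceeds 3^(n''/3), can be checked by brute force for small totals. The function name `max_product` and the dynamic-programming formulation are illustrative assumptions only.

```python
def max_product(total):
    """Largest product n_1 * ... * n_m over all ways to write total as a sum
    of positive integers n_1 + ... + n_m (m is not fixed)."""
    best = [1] * (total + 1)          # best[0] = 1 is the empty product
    for s in range(1, total + 1):
        # choose the size of one set, then split the remainder optimally
        best[s] = max(part * best[s - part] for part in range(1, s + 1))
    return best[total]

if __name__ == "__main__":
    for total in range(1, 13):
        # compare cubes to stay in exact integer arithmetic:
        # max_product(total) <= 3 ** (total / 3)  iff  max_product(total) ** 3 <= 3 ** total
        print(total, max_product(total), max_product(total) ** 3 <= 3 ** total)
```

When the total is a multiple of 3 the bound is attained exactly (for instance, three sets of 3 elements give 27 cover problems for n'' = 9); otherwise the maximum stays strictly below it.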



6 Concluding Remarks and Future Work 

From the theoretical point of view, the number of atoms (rather than the total 
length of the input problem) has been identified as the real "source" of complexity 
both of the model equivalence problem and the clause evaluation problem of 
AR's. From the practical point of view, the foundation has been laid for considerably 
more efficient algorithms than previously known ones. The main reason 
for this improvement is the set of redundancy criteria proven in chapter 4. 

In this paper, we have mainly concentrated on the worst case complexity of 
the algorithms under investigation. In practice, however, heuristics concerning 
the rule application strategy and further simplification rules also play a crucial 
role, even if they do not affect the worst case complexity. One possible simplification 
would be to delete from a term tuple set M all term tuples which are 
an instance of some other tuple $t_j \in M$. The search for further improvements of 
this sort as well as any implementational details have been left for future work. 

Remember that in chapter 1, the close relationship between atomic represen- 
tations and constraint solving was already mentioned. In particular, the problems 
of model equivalence and clause evaluation can be first transformed into equa- 
tional problems and then tackled by constraint solving methods. On the other 
hand, one may try to go the other direction and find out in what way the ideas 
presented in this paper (in particular, the redundancy criteria) can be applied 
to constraint solving, e.g.: in [LM 87] or [CL 89]. 
This work only deals with atomic representations of models. However, many 
more formalisms for representing models can be found in the literature (cf. 
[Mat 97]). As in the case of AR's, no particular emphasis is usually put on 
the efficiency of algorithms to actually work with these formalisms. Hence, a 
thorough complexity analysis and the search for reasonably efficient algorithms 
would also be desirable for other model representations. 
Appendix 

A An Overview of Related Methods 

It has already been mentioned that, for the decision problems treated in this 
work, algorithms are also provided in [CZ 91] and [FL 96]. In both papers, an 
algorithm for deciding the H-subsumption problem forms the basis both for solv- 
ing the model equivalence problem and the clause evaluation problem. Further- 
more, the way the H-subsumption problem is treated is decisive for the overall 
complexity of these algorithms. Therefore, in this chapter, we shall concentrate 
on the H-subsumption problem when the main ideas of these algorithms are 
outlined and their complexity is compared with our algorithm: 

The H-subsumption algorithm in [FL 96] is based on the following theorem: 
Let $\mathcal{C}$ and $\mathcal{D}$ be sets of clauses s.t. the minimum depth of variable occurrences 
in $\mathcal{D}$ is greater than the term depth of $\mathcal{C}$. Then H-subsumption and ordinary 
subsumption coincide. Hence, the H-subsumption problem for an atom 
set $\mathcal{A}$ and an atom B can be decided as follows: First, B is transformed into an 
equivalent atom set $\mathcal{B}$ by partial saturation, s.t. the minimum depth of variable 
occurrences in $\mathcal{B}$ is greater than the term depth of $\mathcal{A}$. Then, it is tested whether 
every atom from $\mathcal{B}$ is an instance of some atom from $\mathcal{A}$. 

If the condition on the minimum depth of variable occurrences is already 
fulfilled by B, then the algorithm from [FL 96] is polynomial. However, in the 
worst case, the partial saturation leads to a set $\mathcal{B}$ with exponentially many atoms. 
If the signature contains at least one function symbol of arity greater than or 
equal to 2, then even the size of the atoms in $\mathcal{B}$ may become exponential. In any 
case, the exponentiality both of the time and space complexity refers to the size 
of the atoms of the original H-subsumption problem rather than to the number 
of atoms. 

In [CZ 91], the H-subsumption problem of whether $\{P(t_1), \dots, P(t_n)\}$ H-subsumes $P(s)$ is reduced 
to the equational problem 

$(\forall y_1 : t_1 \neq s) \wedge \dots \wedge (\forall y_n : t_n \neq s)$, 

where $y_i$ denotes the vector of variables in $t_i$. The satisfiability of this equational 
problem is then tested by the method from [CL 89]. The rules in [CL 89] may 
be applied non-deterministically. At any rate, in the absence of redundancy 
criteria similar to ours, the only way to deal with the universally quantified 
variables $y_i$ is to first transform arbitrary disequations containing universally 
quantified variables into disequations of the form $y_i \neq t$, s.t. $y_i$ does not occur 
in t. Then such a variable $y_i$ can be eliminated by the transformation rule $U_2$ 
(= "universality of parameters") from [CL 89], i.e.: 

$\forall y : P \wedge (y_i \neq t \vee R) \;\rightarrow\; \forall y : P \wedge R[y_i \leftarrow t]$. 

In order to eliminate all occurrences of universally quantified variables at a depth 
greater than 0, the explosion rule has to be applied repeatedly. The idea behind 
this transformation is illustrated by the following example: 

The problem $\forall y : f(f(y)) \neq x$ over the signature $FS(H) = \{f, a\}$ is equivalent 
to the disjunction of the problems $\exists u : x = f(u) \wedge (\forall y : f(f(y)) \neq x)$ 
and $x = a \wedge (\forall y : f(f(y)) \neq x)$. These problems can be simplified to 
$\exists u : x = f(u) \wedge (\forall y : f(y) \neq u)$ and $x = a$, respectively. Then the depth of occurrence 
of the universally quantified variable y has been strictly decreased in all 
subproblems. 
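The equivalence in this example can be checked on a bounded fragment of the universe. The sketch below assumes the reading of the example as ∀y : f(f(y)) ≠ x versus x = a ∨ ∃u : (x = f(u) ∧ ∀y : f(y) ≠ u), and encodes the ground term fⁿ(a) as the natural number n; both function names and the bounded quantification are illustrative assumptions.

```python
def original(x, bound):
    """forall y: f(f(y)) != x, with y ranging over f^0(a), ..., f^(bound-1)(a)."""
    return all(y + 2 != x for y in range(bound))

def exploded(x, bound):
    """x = a, or exists u with x = f(u) and forall y: f(y) != u (same bounded range)."""
    if x == 0:
        return True
    return any(
        x == u + 1 and all(y + 1 != u for y in range(bound))
        for u in range(bound)
    )

if __name__ == "__main__":
    bound = 10
    print([x for x in range(bound) if original(x, bound)])
    print(all(original(x, bound) == exploded(x, bound) for x in range(bound)))
```

Under this encoding both formulations single out exactly the ground terms a and f(a) as solutions for x.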

In our original equational problem $(\forall y_1 : t_1 \neq s) \wedge \dots \wedge (\forall y_n : t_n \neq s)$, 
this idea of eliminating all occurrences of universally quantified variables at a 
depth greater than 0 has to be applied to every disequation $t_i \neq s$. The number 
of explosion rule applications required depends linearly on the number of 
non-variable positions in the term tuple $t_i$. We therefore get the upper bound 
$O((c*m)^n * pol(N))$ for the worst case time complexity of this equational problem 
solving method, where c is a constant, m is an upper bound on the size of 
the tuples and n denotes the number of tuples. 

In [LM 87], a problem strongly related to our term tuple cover problem is tackled, 
namely: Let an "implicit generalization" over some Herbrand universe be 
given as $t/\{t\vartheta_1, \dots, t\vartheta_m\}$ with the intended meaning that it represents all ground 
instances of t which are not an instance of any $t\vartheta_i$. Apart from some other purposes 
like the transformation of an "implicit generalization" into an equivalent 
"explicit" one (for details, cf. [LM 87]), the algorithm from [LM 87] is used to decide 
whether such a generalization is empty. Note that the original version of this 
algorithm can be easily extended to term tuples t and $t\vartheta_i$. Furthermore, the term 
tuple cover problem $M = \{t_1, \dots, t_n\}$ corresponds to the emptiness problem 
for the "implicit generalization" $x/\{t_1, \dots, t_n\}$, where $x = (x_1, \dots, x_k)$ is an 
arbitrary k-tuple of pairwise distinct variables. Then the H-subsumption problem 
can be solved as follows via the "uncover"-algorithm from [LM 87]: Choose 
some linear term tuple $t_i$ from the right hand side of the implicit generalization 
and perform a partitioning of $H^k$ s.t. one partition corresponds to the ground 
instances of $t_i$. (The idea of this partitioning basically comes down to iterated 
applications of the explosion rule from [CL 89].) Let P denote the set of terms 
generated by the partitioning algorithm from [LM 87] and let $P' = P - \{t_i\}$. 
Then the original implicit generalization is equivalent to the following collection 
of generalizations: 

$\{p/\{mgi(p, t_1), \dots, mgi(p, t_{i-1}), mgi(p, t_{i+1}), \dots, mgi(p, t_n)\} \mid p \in P'\}$, 



where "mgi" denotes the most general common instance of two term tuples. The 
number of tuples on the right hand side is strictly decreased on every recursive 
call of the procedure "uncover". If eventually an implicit generalization is produced 
s.t. the right hand side is empty or contains only non-linear term tuples, 
then this generalization (and, hence, also the original one) is found to be non-empty. 
If, on the other hand, the whole set of generalizations eventually becomes 
empty, then the original generalization actually is empty. 

Note that the number of terms in the partitioning set P depends linearly on 
the number of non-variable positions in the term tuple $t_i$. Hence, analogously 
to the equational problem solving method from [CL 89], we get the upper 
bound $O((c*m)^n * pol(N))$ for the worst case time complexity of the "uncover"-algorithm, 
where c is a constant, m is an upper bound on the size of the tuples 
$t_i$ and n denotes the number of tuples. Even though the two algorithms from 
[CL 89] and [LM 87] behave very similarly in the worst case, the termination 
criterion of the latter algorithm provides a significant improvement, i.e.: rather 
than eliminating all tuples from the right hand side of an implicit generalization, 
it suffices to eliminate the linear tuples only. 

In contrast to our algorithm, there exists no constant c s.t. the worst case complexity 
of the above mentioned algorithms can be bounded by $O(c^n * pol(N))$. In 
particular, when expressions of high term complexity are involved, the advantage 
of our algorithm w.r.t. the others is obvious. Moreover, several ideas presented 
in the other algorithms can be easily incorporated into our algorithm, e.g.: The 
termination criterion from [LM 87] can be used to extend our notion of solved 
forms from definition 3 in the following way: A term tuple set M will be called 
solved, iff either M contains no linear tuple or M contains a tuple $(x_1, \dots, x_k)$ 
of pairwise distinct variables. 
Abstract. This paper concentrates on comparing the relative expressive 
power of five non-monotonic logics that have appeared in the literature. 
The results on the computational complexity of these logics suggest that 
these logics have very similar expressive power that exceeds that of clas- 
sical monotonic logic. A refined classification of non-monotonic logics by 
their expressive power can be obtained using translation functions that 
satisfy additional requirements such as faithfulness and modularity used 
by Gottlob. Basically, we adopt Gottlob’s framework for our analysis, but 
propose a weaker notion of faithfulness. A surprising result is deduced 
in light of Gottlob’s results: Moore’s autoepistemic logic is less expressive 
than Reiter’s default logic and Marek and Truszczyński’s strong 
autoepistemic logic. The expressive power of priority logic by Wang et 
al. is also analyzed and shown to coincide with that of default logic. 
Finally, we present an exact classification of the non-monotonic logics 
under consideration in the framework proposed in the paper. 



1 Introduction 

A variety of non-monotonic logics have been proposed as formalizations of non- 
monotonic reasoning (NMR). Among these formalizations are circumscription 
by McCarthy [19], default logic by Reiter [24], autoepistemic logic by Moore [20], 
strong autoepistemic logic by Marek and Truszczyński [17] as well as priority 
logic by Wang, You and Yuan [28]. The main goal of this paper is to compare 
these five non-monotonic logics on the basis of their expressive power, i.e. their 
capability of representing various problems from the NMR domain. 

A way of measuring the expressive power of a non-monotonic logic is to an- 
alyze the computational complexity of its decision problems, and to rank these 
decision problems in the polynomial time hierarchy (PH) [1]. In fact, complexity 
issues have received much attention in the NMR community recently, and the de- 
cision problems of default logic, (strong) autoepistemic logic and circumscription 
have been systematically analyzed [4,5,8,18,21,26]. To summarize these results in 
the propositional case, the major decision problems of these four non-monotonic 
logics are complete problems on the second level of PH. These complexity re- 
sults suggest that (i) the expressive powers of non-monotonic logics exceed that 
of classical monotonic logic and (ii) the non-monotonic logics mentioned are of 
equal expressive power - if measured by the levels of PH. 

The expressibility issue can also be addressed in terms of translation functions 
between theories of non-monotonic logics. For instance, a variety of translation 
functions have been proposed to transform a default theory into an autoepis- 
temic one [2,9,12,13,17,25,27] and back [11,18] such that sets of conclusions are 
preserved to a reasonable degree in the translation. Also, translation functions 
between various kinds of default theories have been considered [2,6,15]. In fact, 
complexity results are based on polynomial transformations between decision 
problems of non-monotonic logics and thus give rise to translation functions be- 
tween non-monotonic theories, too. Unfortunately, the aim of such transforma- 
tions is to preserve the yes/no-answers of decision problems and nothing more. 
This leaves room for translations that depend globally on the theory under trans- 
lation so that a local modification to the theory changes the translation totally. 
However, it is possible to introduce further constraints for translation functions. 
A very promising requirement - modularity - is introduced by Imielinski [10] 
and then used by Gottlob [9] and Niemela [23]. Roughly speaking, a modular 
translation function is in a sense systematic: local changes in a theory cause only 
local changes in its translation. Most importantly, it has been shown that a mod- 
ular translation function between certain non-monotonic logics is not possible. 
This indicates that the non-monotonic logics involved differ in expressive power, 
although their decision problems are equally complex. 

This paper takes the expressive power of five non-monotonic logics into recon- 
sideration. A central concept - the notion of a polynomial, faithful and modular 
translation function - is adopted from Gottlob’s work [9]. However, the notion 
of faithfulness is revised in an important way: it is assumed that sets of con- 
clusions are preserved up to a fixed propositional language. This allows one to 
add, e.g., new propositional atoms in a translation if necessary and this is indeed 
the case with a number of translation functions addressed in this paper. More- 
over, it is shown that polynomial, faithful and modular translations do not exist 
in certain cases in order to establish strict differences in expressive power. The 
comparisons made in the paper lead to an exact classification of non-monotonic 
logics. A particular novelty in this respect is that Moore’s autoepistemic logic is 
less expressive than Reiter’s default logic. Gottlob [9] employs a stronger notion 
of faithfulness and concludes the opposite. This demonstrates in an interesting 
way how the requirements imposed on translation functions affect the results on 
expressibility. Also, new light is shed on the interconnection of priority logic and 
default logic by showing that these logics are of equal expressive power. 

The plan of this paper is as follows. In Section 2, we review the basic notions of 
non-monotonic logics mentioned. After this, the criteria for translation functions 
are set up in Section 3. In Section 4, actual translation functions are presented to 
rank non-monotonic logics by their expressive power. Some comparisons are also 
made with related work. Finally, the resulting classification of non-monotonic 
logics by their expressive power is illustrated and discussed in Section 5. 
2 Logics of Interest 

In this section, we review the basic definitions and notions of non-monotonic 
logics [14,17,20,24,28] that appear in the rest of this work. To allow a uniform 
treatment of these logics in sections to come, we have made the definitions similar 
as follows. (i) Only the propositional case is considered. Definitions are given 
relative to a propositional language $\mathcal{L}$ which is based on a finite or at most 
countable set of propositional atoms $\mathcal{A}$. (ii) A propositional subtheory $T \subseteq \mathcal{L}$ is 
distinguished for each non-monotonic theory, i.e. a theory of a non-monotonic 
logic. (iii) The sets of conclusions associated with non-monotonic theories are 
identified. Such sets are often called extensions or expansions and they determine 
the semantics of non-monotonic theories. Generally speaking, a non-monotonic 
theory may have a unique extension, several extensions, or sometimes even no 
extensions. We consider both brave and cautious reasoning strategies with extensions. 
In the former strategy, finding a single extension for a theory is of interest, 
while the intersection of extensions is considered in the latter. 

2.1 Default Logic (DL) 

A default theory [24] is a pair (D, T) where $T \subseteq \mathcal{L}$ and D is a set of default rules 
(or defaults) of the form $\alpha : \beta_1, \dots, \beta_n / \gamma$ such that $n \geq 0$ and the prerequisite 
$\alpha$, the justifications $\beta_1, \dots, \beta_n$ and the consequent $\gamma$ of the rule are sentences 
of $\mathcal{L}$. Marek and Truszczyński [18] reduce a set of defaults D with respect to 
a propositional theory $E \subseteq \mathcal{L}$ to a set of inference rules $D_E$ which contains an 
inference rule $\alpha/\gamma$ whenever there is a default rule $\alpha : \beta_1, \dots, \beta_n/\gamma \in D$ such 
that $E \cup \{\beta_i\}$ is consistent for all $0 < i \leq n$. We need also the closure of a theory 
$T \subseteq \mathcal{L}$ under a set of inference rules R, denoted by $Cn^R(T)$, which is the least 
theory $E \subseteq \mathcal{L}$ satisfying (i) $T \subseteq E$, (ii) the set of propositional consequences 
$Cn(E) = \{\phi \in \mathcal{L} \mid E \models \phi\} \subseteq E$ and (iii) $\{\gamma \mid \alpha/\gamma \in R \text{ and } \alpha \in E\} \subseteq E$.¹ The sets 
of conclusions associated with a default theory (D, T) are defined as follows. 

Definition 1 (Marek and Truszczyński [18]). A theory $E \subseteq \mathcal{L}$ is an extension 
of a default theory (D, T) if and only if $E = Cn^{D_E}(T)$. 

2.2 Autoepistemic Logic (AEL) 

An autoepistemic language $\mathcal{L}_B$ is the unimodal extension of $\mathcal{L}$ with a modal 
operator B for beliefs, whereas an autoepistemic theory is a set of sentences $\Sigma \subseteq \mathcal{L}_B$ [20]. Sentences 
of the form $B\phi$ are known as belief atoms and the set of logical consequences 
$Cn(\Sigma) \subseteq \mathcal{L}_B$ is defined in the standard way by treating belief atoms as additional 
propositional atoms. In this paper, a pair of theories $(\Sigma, T)$ where $T \subseteq \mathcal{L}$ is a 
propositional theory is also called an autoepistemic theory (then $\Sigma \cup T$ is a theory 
in Moore’s sense). Moore’s idea is to capture the sets of beliefs $\Delta$ of an ideal and 
rational agent which believes exactly the logical consequences of $\Sigma \subseteq \mathcal{L}_B$ and its 
beliefs $B\Delta = \{B\phi \mid \phi \in \Delta\}$ and disbeliefs $\neg B\overline{\Delta} = \{\neg B\phi \mid \phi \in \mathcal{L}_B - \Delta\}$ obtained 
by introspection. Such sets of beliefs/conclusions are called stable expansions. 

¹ Marek and Truszczyński [18] propose a proof system to capture this closure. 

Definition 2 (Moore [20]). A theory $\Delta \subseteq \mathcal{L}_B$ is a stable expansion of an 
autoepistemic theory $\Sigma \subseteq \mathcal{L}_B$ if and only if $\Delta = Cn(\Sigma \cup B\Delta \cup \neg B\overline{\Delta})$. 

2.3 Strong Autoepistemic Logic (SAEL) 

Theories of strong autoepistemic logic [17] are similar to those of AEL. Given
$\Sigma \subseteq \mathcal{L}_B$, we let $\mathrm{Cn}^{B}(\Sigma)$ denote the closure of $\Sigma$ under propositional inference
and the standard necessitation rule: from $\phi$ infer $B\phi$. More formally, $\mathrm{Cn}^{B}(\Sigma)$ is
the least theory $\Delta \subseteq \mathcal{L}_B$ satisfying (i) $\Sigma \subseteq \Delta$, (ii) $\mathrm{Cn}(\Delta) \subseteq \Delta$ and (iii) $B\Delta \subseteq \Delta$.²
This leads to the definition of iterative expansions below. Iterative expansions of
$\Sigma \subseteq \mathcal{L}_B$ are also stable expansions of $\Sigma$, but not necessarily vice versa [17], and
thus $\Sigma$ is assigned a different semantics under iterative expansions.

Definition 3 (Marek and Truszczyński [17]). A theory $\Delta \subseteq \mathcal{L}_B$ is an iterative
expansion of $\Sigma \subseteq \mathcal{L}_B$ if and only if $\Delta = \mathrm{Cn}^{B}(\Sigma \cup \neg B\overline{\Delta})$.

2.4 Parallel Circumscription (CIRC) 

We present a generalization of McCarthy's approach [19], namely parallel
circumscription by Lifschitz [14]. A minimal model theory is a triple $\langle P, F, T\rangle$ where
$P \subseteq \mathcal{A}$ and $F \subseteq \mathcal{A}$ are mutually disjoint sets of atoms and $T \subseteq \mathcal{L}$. The idea
behind parallel circumscription [14] is to distinguish propositional models $\mathcal{M}$ that
are minimal in the sense of Definition 4: as many atoms of $P$ should be false
in $\mathcal{M}$ as possible. Note that the atoms of $F$ remain fixed in the minimization
process while the atoms in $\mathcal{A} - (P \cup F)$ may vary freely.

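Definition 4 (Lifschitz [14]). A propositional model $\mathcal{M} \subseteq \mathcal{A}$ of $T \subseteq \mathcal{L}$ is
$\langle P, F\rangle$-minimal if and only if there is no propositional model $\mathcal{M}' \subseteq \mathcal{A}$ of $T$ such
that $\mathcal{M}' \cap F = \mathcal{M} \cap F$ and $\mathcal{M}' \cap P \subset \mathcal{M} \cap P$.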
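
Definition 4 can be illustrated by enumerating the models of a small theory and discarding those that have a strictly smaller competitor. This is only a sketch under stated assumptions: formulas are modelled as Python predicates on interpretations, and the names `all_models` and `minimal_models` are hypothetical.

```python
from itertools import combinations

def all_models(atoms):
    """Every interpretation over `atoms`, as the frozenset of its true atoms."""
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def minimal_models(theory, atoms, P, F):
    """Models of `theory` that are minimal with respect to the minimized
    atoms P while agreeing on the fixed atoms F (remaining atoms vary)."""
    models = [m for m in all_models(atoms) if all(f(m) for f in theory)]

    def smaller(other, m):
        # same fixed part, strictly smaller minimized part
        return (other & F) == (m & F) and (other & P) < (m & P)

    return [m for m in models if not any(smaller(o, m) for o in models)]

# Example theory T = {a -> b} over the atoms a and b.
a_implies_b = lambda m: ("a" not in m) or ("b" in m)

# Minimizing both atoms leaves only the empty model.
both_minimized = minimal_models([a_implies_b], ("a", "b"), {"a", "b"}, set())

# Minimizing a while keeping b fixed leaves one model per value of b.
b_fixed = minimal_models([a_implies_b], ("a", "b"), {"a"}, {"b"})
```

With both atoms minimized the only minimal model is the empty one, whereas fixing b yields the two models {} and {b}.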

There is no explicit notion of extensions involved in parallel circumscription,
but let us propose an implicit one. Given a propositional model $\mathcal{M}$, we let
$\mathrm{True}(\mathcal{M})$ denote the theory $\{\phi \in \mathcal{L} \mid \mathcal{M} \models \phi\}$. A $\langle P, F\rangle$-minimal model $\mathcal{M}$ gives
rise to a $\langle P, F\rangle$-extension $E \subseteq \mathcal{L}$ of $T$ which is the intersection of the theories
$\mathrm{True}(\mathcal{M}')$ for all $\langle P, F\rangle$-minimal models $\mathcal{M}'$ of $T$ such that $\mathcal{M}' \cap (P \cup F) =
\mathcal{M} \cap (P \cup F)$. In this setting, the $\langle P, F\rangle$-minimal models of $T$ are divided into
(equivalence) classes that give rise to $\langle P, F\rangle$-extensions. Obviously, there may
be several models in one class, since the atoms in $\mathcal{A} - (P \cup F)$ may vary freely.
As far as the cautious reasoning strategy is concerned, the correspondence of
$\langle P, F\rangle$-extensions and $\langle P, F\rangle$-minimal models given in Proposition 5 is straightforward
to establish. The notion of $\langle P, F\rangle$-extensions proposed is also appropriate if the
brave³ reasoning strategy is used in conjunction with parallel circumscription.

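Proposition 5. Given a minimal model theory $\langle P, F, T\rangle$, the intersection of
the $\langle P, F\rangle$-extensions of the theory $T \subseteq \mathcal{L}$ coincides with the intersection of the
theories $\mathrm{True}(\mathcal{M})$ for all $\langle P, F\rangle$-minimal models $\mathcal{M}$ of $T$.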

² Gottlob's B-proofs [8] capture this closure.

³ Note, e.g., that Eiter and Gottlob [5] consider the complexity of propositional
circumscription according to the cautious strategy only.

2.5 Priority Logic (PL)

A theory of priority logic [28] is a triple $\langle R, P, T\rangle$ where $R$ is a set of (monotonic)
inference rules⁴ in $\mathcal{L}$, $P \subseteq R \times R$ gives a priority relation among the rules of
$R$ and $T \subseteq \mathcal{L}$. The idea behind prioritisation of inference rules is that if two
rules $r_1$ and $r_2$ from $R$ are in the relation $P$ (denoted by $r_1 \prec r_2$ in [28]), then
the application of the rule $r_2$ blocks that of $r_1$ and the rule $r_2$ has a higher
priority than $r_1$ in this sense. For a set of rules $R$ and a theory $E \subseteq \mathcal{L}$, we
write $\mathrm{App}(R, E)$ to denote the set $\{\alpha/\gamma \in R \mid E \models \alpha\}$ which contains the rules
of $R$ that are applicable given $E$. To interpret the priority relation $P$, we define
$\mathrm{Nb}(R, P, R') \subseteq R$ as the set of rules $r \in R$ which are not blocked given that the
rules of $R'$ are applicable, i.e. there is no $r' \in R'$ such that $\langle r, r'\rangle \in P$. These
notions suffice to define extensions for a priority theory $\langle R, P, T\rangle$. The set of
rules $R'$ in the definition is called a stable argument by Wang et al. [28].

Definition 6. A theory $E \subseteq \mathcal{L}$ is an extension of a priority theory $\langle R, P, T\rangle$ if
and only if $E = \mathrm{Cn}^{R'}(T)$ for $R' \subseteq R$ satisfying $R' = \mathrm{App}(\mathrm{Nb}(R, P, R'), E)$.
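
The stability condition of Definition 6 can be tested by enumerating candidate rule sets R'. The sketch below is an illustration only, not the procedure of Wang et al. [28]: theories are represented by their model sets, rules by pairs of Python predicates, and the priority relation by index pairs (i, j) meaning that rule j blocks rule i. All names and the example theory are hypothetical.

```python
from itertools import combinations

def all_models(atoms):
    """Every interpretation over `atoms`, as the frozenset of its true atoms."""
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def closure(models_of_t, rules):
    """Models of the least theory containing T and closed under `rules`."""
    current = set(models_of_t)
    changed = True
    while changed:
        changed = False
        for alpha, gamma in rules:
            if all(alpha(m) for m in current):      # alpha is entailed
                narrowed = {m for m in current if gamma(m)}
                if narrowed != current:
                    current, changed = narrowed, True
    return frozenset(current)

def pl_extensions(rules, prio, theory, atoms):
    """Pairs (R', E) such that E is the closure of T under R' and R' equals
    the set of applicable rules among those not blocked by R'."""
    universe = all_models(atoms)
    models_of_t = [m for m in universe if all(f(m) for f in theory)]
    idx = range(len(rules))
    found = []
    for r in range(len(rules) + 1):
        for chosen in map(set, combinations(idx, r)):
            ext = closure(models_of_t, [rules[i] for i in chosen])
            not_blocked = {i for i in idx
                           if not any((i, j) in prio for j in chosen)}
            applicable = {i for i in not_blocked
                          if all(rules[i][0](m) for m in ext)}
            if applicable == chosen:
                found.append((sorted(chosen), ext))
    return found

top = lambda m: True
a = lambda m: "a" in m
b = lambda m: "b" in m

# Hypothetical theory: rule 0 is T/a, rule 1 is T/b, and rule 1 blocks rule 0.
rules = [(top, a), (top, b)]
stable = pl_extensions(rules, {(0, 1)}, [], ("a", "b"))
```

In this example only the rule set consisting of rule 1 is stable, so the single extension is the theory whose models are exactly those satisfying b.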

3 Requirements Imposed on Translation Functions 

From now on, we restrict ourselves to finite theories of the non-monotonic logics
introduced in Section 2. In this section, we introduce the basic requirements for
translation functions that map a theory of one non-monotonic logic to a theory
of another. The requirements are called polynomiality, faithfulness and
modularity. In the forthcoming mathematical formulations of these requirements,
we let $\langle X, T\rangle$ stand for a non-monotonic theory where $T$ is its propositional
subtheory and $X$ stands for any set(s) of syntactic elements which are specific
to the non-monotonic logic in question (such as a set of defaults $D$ in DL). The
non-monotonic theories introduced in Section 2 are clearly of this form. Our
first requirement involves the length of a non-monotonic theory $\langle X, T\rangle$, denoted
by $\|\langle X, T\rangle\|$, which is the number of symbol occurrences needed to represent
$\langle X, T\rangle$.

Definition 7 (Polynomiality). A translation function Tr is polynomial, iff for
all $\langle X, T\rangle$, the time required to compute $\mathrm{Tr}(\langle X, T\rangle)$ is polynomial in $\|\langle X, T\rangle\|$.

To give an example of such a function, we introduce a linear function that
transforms a minimal model theory into an autoepistemic one.

Definition 8 (Niemelä [22]). For all minimal model theories $\langle P, F, T\rangle$, let
$\mathrm{Tr}_N(\langle P, F, T\rangle) = \langle\{\neg Ba \to \neg a \mid a \in P \cup F\} \cup \{\neg B\neg a \to a \mid a \in F\}, T\rangle$.
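
The translation of Definition 8 is purely syntactic, which a few lines of code make visible. The sketch below assumes a hypothetical string encoding of sentences (`~` for negation, `B` for the belief operator, `->` for implication); the function name `tr_n` is likewise an assumption made for illustration.

```python
def tr_n(P, F, T):
    """Translate a minimal model theory (P, F, T) into an autoepistemic
    theory (Sigma, T): one sentence per minimized or fixed atom, plus one
    extra sentence per fixed atom. T is passed through unchanged."""
    sigma = [f"~B {a} -> ~{a}" for a in sorted(P | F)]
    sigma += [f"~B ~{a} -> {a}" for a in sorted(F)]
    return sigma, list(T)

# Example: a is minimized, b is fixed, and T contains one sentence.
sigma, t = tr_n({"a"}, {"b"}, ["a | b"])

# The first component depends on P and F only, not on T.
sigma_empty, _ = tr_n({"a"}, {"b"}, [])
```

The output length is linear in the size of P and F, and the first component does not change when T is replaced by another theory.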

The next question is whether a translation function Tr preserves the semantics
of a non-monotonic theory $\langle X, T\rangle$. We have used the following criteria to
formulate our forthcoming definition. (i) Since the semantics of $\langle X, T\rangle$ is determined
by its extensions and both brave and cautious reasoning strategies should
be supported, a one-to-one correspondence of extensions is a natural solution.
(ii) Only propositionally consistent extensions are taken into account in this
one-to-one relationship, because we have in mind translation functions (such as
$\mathrm{Tr}_1$ in Definition 15) whose faithfulness depends on this restriction. (iii) Moreover,
we are assuming that the language $\mathcal{L}$ of $T$ is a fixed propositional language
which is used for knowledge representation in a given domain. The propositional
languages associated with $\langle X, T\rangle$ and $\mathrm{Tr}(\langle X, T\rangle)$ may extend $\mathcal{L}$, but we project
the extensions of these theories with respect to $\mathcal{L}$. In particular, this means that
a translation function can add new atoms, but within the bounds of our
polynomiality requirement. This seems a crucial option in order to support different
kinds of knowledge representation and reasoning techniques. For instance, the
translation function $\mathrm{Tr}_N$ introduces belief atoms for these reasons.

⁴ Wang et al. [28] use rules of the form $\gamma \leftarrow \alpha_1, \ldots, \alpha_n$ with multiple prerequisites,
but such rules can be represented as $\alpha_1 \wedge \ldots \wedge \alpha_n/\gamma$ under propositional closure.

Definition 9 (Faithfulness). A translation function Tr is faithful, iff for all
$\langle X, T\rangle$, the propositionally consistent extensions of $\langle X, T\rangle$ and $\mathrm{Tr}(\langle X, T\rangle)$ are in
one-to-one correspondence and coincide up to the propositional language $\mathcal{L}$ of
$T$.



This definition ensures that given a faithful translation function Tr, any brave
or cautious conclusion $\phi \in \mathcal{L}$ obtained from $\langle X, T\rangle$ can also be obtained from
the translation $\mathrm{Tr}(\langle X, T\rangle)$, and vice versa. Note that this presumes that only
propositionally consistent extensions are taken into account in the brave strategy.
Let us also point out that our notion is useful only if a notion of extensions is
available for the (non-monotonic) logics involved. Fortunately, this is the case
with the logics addressed in this paper. Our notion of faithfulness is also closely
related to the one by Gottlob [9]. The differences are that Gottlob does not allow
new atoms to be introduced in a translation and he also takes the propositionally
inconsistent extensions into account. Consequently, a translation function that
is faithful in Gottlob's sense is also faithful in our sense. The converse does not
hold in general, as demonstrated in Theorem 16 and Example 23.

By Niemelä's results [22] and the notion of $\langle P, F\rangle$-extensions proposed in
this paper, the translation function introduced in Definition 8 is faithful: the
$\langle P, F\rangle$-extensions of $T$ and the propositionally consistent stable expansions of
$\mathrm{Tr}_N(\langle P, F, T\rangle)$ are in a one-to-one correspondence and coincide up to the
language $\mathcal{L}$ of $T$. An inconsistent stable expansion $\Delta = \mathcal{L}_B$ appears only if $T$ is
inconsistent and there are no $\langle P, F\rangle$-extensions. Our last requirement follows.

Definition 10 (Modularity, Gottlob [9]). A translation function Tr is modular,
iff for all $\langle X, T\rangle$, $\mathrm{Tr}(\langle X, T\rangle) = \langle X', T' \cup T\rangle$ where $\langle X', T'\rangle = \mathrm{Tr}(\langle X, \emptyset\rangle)$.

Our modularity requirement is a generalization of the one that Gottlob
formulated for translations from DL into AEL [9]. In particular, a modular
translation function provides a fixed translation for $X$ (i.e. the non-monotonic theory
$\mathrm{Tr}(\langle X, \emptyset\rangle)$) which is independent of $T$. Therefore, if $T$ is updated, there is no
need to recompute the fixed part $\mathrm{Tr}(\langle X, \emptyset\rangle)$ in order to compute $\mathrm{Tr}(\langle X, T\rangle)$.
Note also that the translation function $\mathrm{Tr}_N$ of Definition 8 is modular in this
sense, because $P$ and $F$ are translated into a fixed autoepistemic theory.

For the sake of brevity, we say that a translation function is PFM if it 
satisfies the three requirements set up in Definitions 7, 9 and 10. A fundamental 
property of PFM translations is pointed out in the following. 

Proposition 11. A composition of PFM translation functions is also PFM. 

PFM translation functions provide us with a basis for analyzing the relative
expressive power of non-monotonic logics. The motivation for this is that if
there is a PFM translation function that maps theories of a non-monotonic logic
$L_1$ to theories of a non-monotonic logic $L_2$, then we consider $L_2$ to be at least
as expressive as $L_1$. This gives rise to a preorder among (non-monotonic) logics.
For instance, AEL is at least as expressive as CIRC, because $\mathrm{Tr}_N$ is PFM. If, in
addition, there are no PFM translation functions in the opposite direction, then
we say that $L_1$ is less expressive than $L_2$. If there are PFM translation functions
in both directions, then $L_1$ and $L_2$ are of equal expressive power. As concluded
by Gottlob [9], this view identifies the expressive power of non-monotonic logics
with their capability of representing different propositional closures in $\mathcal{L}$.

As a final issue in this section, we compare our approach with another by
Gogic, Kautz, Papadimitriou and Selman [7]. They propose a framework for
analyzing the succinctness of knowledge representation (i.e. the space required
in knowledge representation) and thus also the expressive power of the formalisms
involved, but different kinds of translation functions are used. (i) Gogic et al.
use a different polynomiality requirement: the length of the translation has to
be polynomial in the length of the theory. This allows even exponential
computations to obtain a translation as long as only a polynomial blow-up results in
the translation. Our requirement restricts the translation time and thus also the
translation space to be polynomial. (ii) Gogic et al. formulate their notion of
faithfulness as a requirement that the propositional models of the theory under
translation are preserved. If a translation function is faithful in our sense, then it
is in their sense, too, provided that models of extensions are taken into account
up to $\mathcal{L}$. The converse does not hold in general, since our notion of faithfulness
presumes that a notion of extensions is available for the non-monotonic logics
involved. (iii) Gogic et al. do not employ a modularity requirement.



4 Classifying Non-monotonic Logics 

Having set up the notion of a PFM translation function, we exhibit such translation
functions in this section in order to classify non-monotonic logics by
their expressive power. Moreover, counter-examples are provided to show that
such translations are not possible in certain cases. Such non-equivalence proofs
have already been devised for non-monotonic logics by Imielinski [10], Gottlob
[9] and Niemelä [23]. In the forthcoming subsections, we perform a pairwise
comparison of non-monotonic logics in the following order: CIRC, AEL, SAEL, DL
and PL. But to begin with, we relate classical propositional logic (CL) with CIRC.
This result is supported by other complexity and intranslatability results [3,4,5].

Theorem 12. CL is less expressive than CIRC. 

Proof. A propositional theory $T \subseteq \mathcal{L}$ is translated into a minimal model theory
$\mathrm{Tr}_0(T) = \langle\emptyset, \emptyset, T\rangle$ having the same propositional language $\mathcal{L}$. The only $\langle\emptyset, \emptyset\rangle$-extension
of $T$ is $\mathrm{Cn}(T)$, i.e. the natural "extension" of $T$ in propositional logic.
Thus it is easy to see that $\mathrm{Tr}_0$ is PFM and CIRC is at least as expressive as
CL. Let us then assume there is also a PFM translation function Tr in the other
direction. Let $\mathcal{A} = \{a, b\}$ and let $\mathcal{L}$ be the corresponding propositional language. Then
consider a minimal model theory $\langle P, F, T\rangle$ based on $\mathcal{L}$ where $T = \{a \to b\}$,
$P = \{a, b\}$ and $F = \emptyset$. This has a unique $\langle P, F\rangle$-minimal model $\mathcal{M} = \emptyset$ so that
a unique $\langle P, F\rangle$-extension $E = \mathrm{Cn}(\{\neg a, \neg b\})$ results. Thus the propositional
translation $\mathrm{Tr}(\langle P, F, T\rangle)$ must entail $\neg a$ and $\neg b$. However, if we update $T$ to $T' =
T \cup \{a\}$, there is a unique $\langle P, F\rangle$-extension $E' = \mathrm{Cn}(\{a, b\})$ of $T'$. By modularity,
the translation $\mathrm{Tr}(\langle P, F, T'\rangle)$ has to be $\mathrm{Tr}(\langle P, F, T\rangle) \cup \{a\}$ which is necessarily
propositionally inconsistent. Thus Tr cannot be faithful, a contradiction.



4.1 Comparison of CIRC and AEL 

The translation function $\mathrm{Tr}_N$ satisfies our requirements by Niemelä's results [22].

Theorem 13 (Niemelä [22]). The translation function $\mathrm{Tr}_N$ is PFM.

This indicates that reasoning corresponding to $\langle P, F\rangle$-minimal models is easily
captured in terms of stable expansions of the translation and that AEL is at
least as expressive as CIRC. In Theorem 14, we adopt a counter-example given
by Niemelä [23] to show that there is no translation function meeting our criteria
in the opposite direction. Thus CIRC is less expressive than AEL.

Theorem 14. There is no PFM translation function from autoepistemic theo- 
ries under stable expansions into minimal model theories. 

Proof. Let $\mathcal{A} = \{a, b\}$ and let $\mathcal{L}$ and $\mathcal{L}_B$ be the respective propositional and
autoepistemic languages. Let us make a hypothesis that there is a fixed polynomial
translation of $\Sigma = \{Ba \to b\} \subseteq \mathcal{L}_B$ into sets of atoms $P$ and $F$ and a
propositional theory $\mathrm{Tr}(\Sigma)$ such that for all $T \subseteq \mathcal{L}$, the propositionally consistent
stable expansions of $\langle\Sigma, T\rangle$ and $\langle P, F\rangle$-extensions of $\mathrm{Tr}(\Sigma) \cup T$ are in one-to-one
correspondence and coincide up to $\mathcal{L}$. The language $\mathcal{L}'$ of $\mathrm{Tr}(\Sigma) \cup T$ is assumed
to be based on $\mathcal{A}' \supseteq \mathcal{A}$ and the sets of atoms $P$ and $F$ are subsets of $\mathcal{A}'$.

For $T = \emptyset$, there is exactly one propositionally consistent stable expansion
$\Delta = \{\neg Ba, \neg Bb, \ldots\}$ of $\langle\Sigma, T\rangle$ such that $a \to b \notin \Delta$. It follows by our hypothesis
that there is a unique $\langle P, F\rangle$-extension $E$ of $\mathrm{Tr}(\Sigma) \cup T$ such that $\Delta \cap \mathcal{L} = E \cap \mathcal{L}$.
This implies that $a \to b \notin E$. So there is a $\langle P, F\rangle$-minimal model $\mathcal{M}$ of $\mathrm{Tr}(\Sigma) \cup T$
such that $\mathcal{M} \not\models a \to b$, i.e. $\mathcal{M} \models a$ and $\mathcal{M} \not\models b$. Then let $T' = \{a\}$ so that
also $\mathcal{M} \models \mathrm{Tr}(\Sigma) \cup T'$. It is easy to see that $\mathcal{M}$ is a $\langle P, F\rangle$-minimal model of
$\mathrm{Tr}(\Sigma) \cup T'$, since otherwise $\mathcal{M}$ would not be a $\langle P, F\rangle$-minimal model of $\mathrm{Tr}(\Sigma) \cup T$.
It follows that $b$ does not belong to the corresponding (propositionally consistent)
$\langle P, F\rangle$-extension $E'$ of $\mathrm{Tr}(\Sigma) \cup T'$. By our hypothesis, there is a propositionally
consistent stable expansion $\Delta'$ of $\langle\Sigma, T'\rangle$ such that $b \notin \Delta'$. But this is a
contradiction, since the only stable expansion of $\langle\Sigma, T'\rangle$ is $\{a, Ba, b, Bb, \ldots\}$. □



4.2 Comparison of AEL and SAEL 

The author [11] presents a translation function that allows one to capture the
stable expansions of an autoepistemic theory with the iterative expansions of
the translation. We let $\mathrm{RBa}(\phi)$ denote the set of belief atoms that appear in an
autoepistemic sentence $\phi$ recursively and define $\mathrm{RBa}(\Sigma) = \bigcup\{\mathrm{RBa}(\phi) \mid \phi \in \Sigma\}$
for sets of autoepistemic sentences $\Sigma$. To give a simple example, we note that
$\mathrm{RBa}(B(Bp \wedge Bq)) = \{B(Bp \wedge Bq), Bp, Bq\}$. The intuition behind the translation
is that the positive introspection ($B\Delta$) in the definition of stable expansions is
realized using instances $\neg B\neg B\phi \to B\phi$ of the axiom schema 5.

Definition 15 (Janhunen [11]). For all autoepistemic theories $\langle\Sigma, T\rangle$, the
translation $\mathrm{Tr}_1(\langle\Sigma, T\rangle) = \langle\Sigma \cup \{\neg B\neg B\phi \to B\phi \mid B\phi \in \mathrm{RBa}(\Sigma)\}, T\rangle$.
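
The recursive collection of belief atoms and the translation of Definition 15 can be sketched as follows. The encoding is an assumption made for illustration: atoms are strings and compound sentences are nested tuples tagged with `"not"`, `"and"`, `"imp"` or `"B"`; the function names `rba` and `tr_1` are hypothetical.

```python
def rba(phi):
    """Belief atoms occurring in phi, including nested ones."""
    if isinstance(phi, str):            # propositional atom
        return set()
    op, *args = phi
    found = set()
    for sub in args:
        found |= rba(sub)
    if op == "B":                       # phi itself is a belief atom
        found.add(phi)
    return found

def tr_1(sigma, theory):
    """Add the instance (not B not B phi) -> (B phi) of schema 5 for every
    belief atom B phi that occurs in sigma; the theory T is left as it is."""
    belief_atoms = set()
    for phi in sigma:
        belief_atoms |= rba(phi)
    schema5 = {("imp", ("not", ("B", ("not", b))), b) for b in belief_atoms}
    return set(sigma) | schema5, set(theory)

# The example from the text: RBa(B(Bp & Bq)).
nested = ("B", ("and", ("B", "p"), ("B", "q")))
atoms_found = rba(nested)

# Translating the single sentence Bp -> p with an empty theory T.
sigma = {("imp", ("B", "p"), "p")}
new_sigma, new_theory = tr_1(sigma, set())
```

For the sentence Bp -> p the translation adds exactly one new sentence, the instance of schema 5 for Bp.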



Theorem 16. The translation function $\mathrm{Tr}_1$ given in Definition 15 is PFM.

Proof. The polynomiality and modularity of $\mathrm{Tr}_1$ are easily seen from the
definition. Faithfulness follows by the results of Marek et al. [16] and the
author [11], namely the propositionally consistent stable expansions of $\langle\Sigma, T\rangle$ and
the iterative expansions of $\mathrm{Tr}_1(\langle\Sigma, T\rangle)$ coincide. This implies the one-to-one
correspondence of propositionally consistent expansions as required by
Definition 9 so that the translation function $\mathrm{Tr}_1$ is also faithful. Note that $\mathrm{Tr}_1$ is
not faithful in Gottlob's sense [9], since an autoepistemic theory $\langle\Sigma, \emptyset\rangle$ where
$\Sigma = \{Bp \to p \wedge r, Bq \to q \wedge \neg r\}$ [11] has a propositionally inconsistent stable
expansion $\Delta = \mathcal{L}_B$ which is not an iterative expansion of $\mathrm{Tr}_1(\langle\Sigma, \emptyset\rangle)$.

Theorem 17 shows that a PFM translation in the opposite direction is not
possible. The proof is obtained by modifying Gottlob's proof [9] which shows
that a modular translation from DL into AEL cannot be realized (an analog
of this result is considered later as Corollary 22). Together, Theorems 16 and 17
signify that AEL is less expressive than SAEL.

Theorem 17. There is no PFM translation function from autoepistemic theo- 
ries under iterative expansions into such theories under stable expansions. 

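Proof. Let $\mathcal{A} = \{a, b\}$ be a set of atoms and let $\mathcal{L}$ and $\mathcal{L}_B$ be the respective
propositional and autoepistemic languages. Let us then make a hypothesis that
there is a fixed polynomial translation of $\Sigma = \{Ba \to b, B(a \to b) \to a\} \subseteq
\mathcal{L}_B$ into $\Sigma' \subseteq \mathcal{L}'_B$ and $T' \subseteq \mathcal{L}'$ such that for all $T \subseteq \mathcal{L}$ the propositionally
consistent iterative expansions of $\langle\Sigma, T\rangle$ and the propositionally consistent stable
expansions of $\langle\Sigma', T' \cup T\rangle$ are in one-to-one correspondence and coincide up to
$\mathcal{L}$. The languages $\mathcal{L}'$ and $\mathcal{L}'_B$ are assumed to be based on a set of atoms $\mathcal{A}' \supseteq \mathcal{A}$.
Note that we may assume a single translation $\mathrm{Tr}(\Sigma) = \Sigma' \cup T'$ without a loss
of generality, since $\langle\Sigma', T' \cup T\rangle$ and $\langle\mathrm{Tr}(\Sigma), T\rangle$ are effectively the same in AEL.
Let us then introduce propositional theories $T_0 = \emptyset$, $T_1 = \{a\}$, $T_2 = \{a \to b\}$
and $T_3 = \{a, a \to b\}$. For $i \in \{1, 2, 3\}$, the autoepistemic theory $\langle\Sigma, T_i\rangle$ has
a unique, propositionally consistent iterative expansion $\Delta = \{a, Ba, b, Bb, a \to
b, B(a \to b), \ldots\}$. As Tr is faithful, there are unique propositionally consistent stable
expansions $\Delta'_i$ of $\langle\mathrm{Tr}(\Sigma), T_i\rangle$ that coincide with $\Delta$ up to $\mathcal{L}$.

Since $a \to b \in \Delta \cap \mathcal{L}$, it follows that $a \to b \in \Delta'_1$. Since $\Delta'_1$ is a stable
expansion of $\langle\mathrm{Tr}(\Sigma), T_1\rangle$, it holds that $\Delta'_1 = \mathrm{Cn}(\mathrm{Tr}(\Sigma) \cup T_1 \cup B\Delta'_1 \cup \neg B\overline{\Delta'_1})$ and
thus $\mathrm{Tr}(\Sigma) \cup T_1 \cup B\Delta'_1 \cup \neg B\overline{\Delta'_1} \models a \to b$. It follows that also $\Delta'_1 = \mathrm{Cn}(\mathrm{Tr}(\Sigma) \cup
T_3 \cup B\Delta'_1 \cup \neg B\overline{\Delta'_1})$, i.e. $\Delta'_1$ is a stable expansion of $\langle\mathrm{Tr}(\Sigma), T_3\rangle$. Because $a \in
\Delta \cap \mathcal{L}$, it follows similarly that $\Delta'_2$ is a stable expansion of $\langle\mathrm{Tr}(\Sigma), T_3\rangle$. Then
$\Delta'_1 = \Delta'_2 = \Delta'_3$ is necessarily the case, as $\Delta'_3$ is the unique stable expansion of
$\langle\mathrm{Tr}(\Sigma), T_3\rangle$. So let $\Delta'$ denote any of $\Delta'_i$ with $i \in \{1, 2, 3\}$. Since $b \in \Delta \cap \mathcal{L}$, we
know that $b \in \Delta'$. Then it follows that $\mathrm{Tr}(\Sigma) \cup \{a\} \cup B\Delta' \cup \neg B\overline{\Delta'} \models b$ and the
deduction theorem of propositional logic implies $\mathrm{Tr}(\Sigma) \cup B\Delta' \cup \neg B\overline{\Delta'} \models a \to b$.
Thus $\Delta' = \mathrm{Cn}(\mathrm{Tr}(\Sigma) \cup B\Delta' \cup \neg B\overline{\Delta'})$, i.e. $\Delta'$ is a propositionally consistent stable
expansion of $\langle\mathrm{Tr}(\Sigma), T_0\rangle$. Then $\langle\Sigma, T_0\rangle$ has a propositionally consistent iterative
expansion which contains both $a$ and $b$. But this is a contradiction, since the only
iterative expansion of $\langle\Sigma, T_0\rangle$ is $\Delta'' = \{\neg Ba, \neg Bb, \neg B(a \to b), \ldots\}$. □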



4.3 Comparison of SAEL and DL 

The author [11] has proposed an idea of representing autoepistemic introspection
in terms of default rules. In this approach, default rules of the forms $\phi : / B\phi$ and
$\top : \neg\phi / \neg B\phi$ capture the positive and the negative introspection of an autoepistemic
sentence $\phi \in \mathcal{L}_B$, respectively. A translation function is obtained as follows.

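Definition 18 (Janhunen [11]). For all autoepistemic theories $\langle\Sigma, T\rangle$, let
$\mathrm{Tr}_2(\langle\Sigma, T\rangle) = \langle\{\phi : / B\phi \mid B\phi \in \mathrm{RBa}(\Sigma)\} \cup \{\top : \neg\phi / \neg B\phi \mid B\phi \in \mathrm{RBa}(\Sigma)\}, \Sigma \cup T\rangle$.

The propositional language $\mathcal{L}'$ of the translation $\mathrm{Tr}_2(\langle\Sigma, T\rangle)$ is assumed to
contain atoms that correspond to the belief atoms in $\mathrm{RBa}(\Sigma)$ exactly.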

Theorem 19. The translation function $\mathrm{Tr}_2$ given in Definition 18 is PFM.

Proof. The translation function $\mathrm{Tr}_2$ is clearly polynomial and modular. For the
faithfulness of the translation, we refer to results shown by the author elsewhere
[11, Theorem 13 and Proposition 16]. First of all, the author shows that the
iterative expansions of an autoepistemic theory $\Sigma \subseteq \mathcal{L}_B$ and the extensions
of a translation $\langle\{\phi : / B\phi \mid \phi \in \mathcal{L}_B\} \cup \{\top : \neg\phi / \neg B\phi \mid \phi \in \mathcal{L}_B\}, \Sigma\rangle$ coincide (note that this is
an extended and infinite translation). A translation obtained by $\mathrm{Tr}_2$ is limited
to belief atoms in $\mathrm{RBa}(\Sigma)$ and thus it captures essentially full sets [8] of $\Sigma$.
This implies the one-to-one correspondence between the iterative expansions of
$\langle\Sigma, T\rangle$ and the extensions of $\mathrm{Tr}_2(\langle\Sigma, T\rangle)$. In addition, the propositional parts of
expansions and extensions in question coincide. □

A number of principles have been proposed to translate default theories
into autoepistemic ones. Basically, the problem is to translate a default $\alpha :
\beta_1, \ldots, \beta_n/\gamma$ into an autoepistemic sentence. Konolige [13] introduces a translation
$B\alpha \wedge \neg B\neg\beta_1 \wedge \ldots \wedge \neg B\neg\beta_n \to \gamma$ for a default. Unfortunately, the resulting
translation function $\mathrm{Tr}_K$ for default theories is not faithful in general, as shown
by Marek and Truszczyński [17]. As a response to this problem, they handle
justifications $\beta_i$ differently: $\neg B\neg\beta_i$ is replaced by $\neg BB\neg\beta_i$ in their translation.
Later, Truszczyński [27] ends up with a translation $B\neg B\neg\beta_i$ for justifications.
This gives rise to a translation function for default theories as follows.

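Definition 20 (Truszczyński [27]). For all default theories $\langle D, T\rangle$, define
$\mathrm{Tr}_T(\langle D, T\rangle) = \langle\{B\alpha \wedge B\neg B\neg\beta_1 \wedge \ldots \wedge B\neg B\neg\beta_n \to \gamma \mid \alpha : \beta_1, \ldots, \beta_n/\gamma \in D\}, T\rangle$.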
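
Like the other translations, the one of Definition 20 rewrites each default independently of the rest of the theory. A minimal sketch, assuming a hypothetical string encoding (`~` for negation, `&` for conjunction, `B` for belief) and a hypothetical function name `tr_t`:

```python
def tr_t(defaults, T):
    """Map each default (alpha, [beta_1, ..., beta_n], gamma) to one
    autoepistemic sentence; the propositional theory T is passed through."""
    sigma = []
    for alpha, justs, gamma in defaults:
        body = [f"B({alpha})"] + [f"B~B~({beta})" for beta in justs]
        sigma.append(" & ".join(body) + f" -> {gamma}")
    return sigma, list(T)

# Hypothetical default a : b / c together with the theory T = {a}.
sigma, t = tr_t([("a", ["b"], "c")], ["a"])
```

Each default yields one sentence whose length is linear in the length of the default, which is in line with the polynomiality and modularity claimed for the translation.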

Theorem 21 (Marek and Truszczyński [18]). The translation function
$\mathrm{Tr}_T$ given in Definition 20 is PFM.

It is worth mentioning that the above result holds as long as the notion of
faithfulness takes only the propositionally consistent expansions of the
translation into account. Niemelä [22] demonstrates that for a set of defaults $D =
\{\ldots\}$ and a theory $T = \{a\}$ the translation $\mathrm{Tr}_T(\langle D, T\rangle)$ has an inconsistent
iterative expansion while $\langle D, T\rangle$ has no extensions. However, Gottlob [9]
proposes a variant of $\mathrm{Tr}_T$ that avoids such inconsistent iterative expansions.

Since PFM translations exist in both directions, we conclude that SAEL and 
DL have an equal expressive power according to the measure set up in Section 
3. Note also that the theorems presented so far constitute an indirect proof of 
the following corollary. Therefore, Gottlob’s intranslatability result [9] remains 
valid although a weaker notion of faithfulness is applied. 

Corollary 22 (Gottlob [9]). There is no PFM translation function from de- 
fault theories into autoepistemic theories under stable expansions. 

Proof. Assume there is such a function. By Theorem 19 and Proposition 11, there 
is a PFM translation function that maps autoepistemic theories under iterative 
expansions to ones under stable expansions. But this contradicts Theorem 17. 

□ 

In spite of this result, Gottlob [9] sets up a non-modular translation function
$\mathrm{Tr}_G$ to capture the extensions of a default theory $\langle D, T\rangle$ with the stable
expansions of $\mathrm{Tr}_G(\langle D, T\rangle)$.⁵ Then he provides a counter-example showing that faithful
translations in the other direction are not possible and concludes that DL
is less expressive than AEL. This conclusion is in contrast with Theorems 16
and 19 and Proposition 11 which indicate that there is a PFM translation
function (the composition of $\mathrm{Tr}_1$ and $\mathrm{Tr}_2$) for this purpose. The difference between
the two views is due to the notions of faithfulness considered. Gottlob assumes
that the language $\mathcal{L}$ of the default theory $\langle D, T\rangle$ obtained as a translation of
an autoepistemic theory $\Sigma \subseteq \mathcal{L}_B$ (under stable expansions) is the propositional
sublanguage of $\mathcal{L}_B$. However, we are ready to introduce new atoms to extend $\mathcal{L}$.

⁵ Schwarz [25] proposes an alternative translation for this purpose.

To illustrate the effect of new atoms, we construct a default theory in order
to capture the (propositionally consistent) stable expansions of an autoepistemic
theory $\Sigma = \{Bp \to p\}$ used in Gottlob's counter-example [9].

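Example 23. Let $\mathcal{A} = \{p\}$ and let $\mathcal{L}$ and $\mathcal{L}_B$ be the respective propositional and
autoepistemic languages. Then let $\Sigma = \{Bp \to p\} \subseteq \mathcal{L}_B$. The stable expansions
of $\langle\Sigma, \emptyset\rangle$ are $\Delta_1 = \{\neg Bp, B\neg Bp, \ldots\}$ and $\Delta_2 = \{Bp, p, \neg B\neg Bp, \ldots\}$ so that
$\Delta_1 \cap \mathcal{L} = \mathrm{Cn}(\emptyset)$ and $\Delta_2 \cap \mathcal{L} = \mathrm{Cn}(\{p\})$. Since $\mathrm{Cn}(\emptyset) \subset \mathrm{Cn}(\{p\})$, Gottlob [9]
concludes that there is no default theory with extensions $\mathrm{Cn}(\emptyset)$ and $\mathrm{Cn}(\{p\})$,
because the extensions of a default theory form an antichain [24].

As the first step, we apply $\mathrm{Tr}_1$ and add an instance of the schema 5 to $\Sigma$
and obtain $\Sigma' = \{Bp \to p, \neg B\neg Bp \to Bp\}$. It is easy to see that $\Delta_1$ and $\Delta_2$ are
the iterative expansions of $\langle\Sigma', \emptyset\rangle$. In particular, note that $\neg Bp \notin \Delta_2$ implies
that $\neg B\neg Bp \in \neg B\overline{\Delta_2}$ so that $Bp$ and $p$ are B-provable from $\Sigma' \cup \neg B\overline{\Delta_2}$, since
$\Sigma'$ contains the critical instance $\neg B\neg Bp \to Bp$ of 5.

The next step is to apply $\mathrm{Tr}_2$. An extended propositional language $\mathcal{L}'$ based
on a set of atoms $\mathcal{A}' = \mathcal{A} \cup \{Bp, B\neg Bp\}$ is introduced. The set of defaults
introduced by $\mathrm{Tr}_2$ is $D = \{p : / Bp,\ \top : \neg p / \neg Bp,\ \neg Bp : / B\neg Bp,\ \top : \neg\neg Bp / \neg B\neg Bp\}$.
Consequently, the extensions of the resulting default theory $\langle D, \Sigma'\rangle$ are
$E_1 = \mathrm{Cn}(\Sigma' \cup \{\neg Bp, B\neg Bp\})$ and $E_2 = \mathrm{Cn}(\Sigma' \cup \{Bp, \neg B\neg Bp\})$, because the
reductions of $D$ are $D_{E_1} = \{p/Bp, \top/\neg Bp, \neg Bp/B\neg Bp\}$ and
$D_{E_2} = \{p/Bp, \neg Bp/B\neg Bp, \top/\neg B\neg Bp\}$. It follows
that $E_1 \cap \mathcal{L} = \mathrm{Cn}(\emptyset)$ and $E_2 \cap \mathcal{L} = \mathrm{Cn}(\{p\})$. Thus the stable expansions of
$\langle\Sigma, \emptyset\rangle$ and the extensions of $\langle D, \Sigma'\rangle$ coincide up to $\mathcal{L}$. In particular, the extended
language $\mathcal{L}'$ allows the relationship $E_1 \cap \mathcal{L} \subset E_2 \cap \mathcal{L}$, although $E_1 \not\subseteq E_2$.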

Bonatti and Eiter [2] analyze the expressive power of non-monotonic logics
as query languages for disjunctive databases. A comparison with our results is
possible in the propositional case, if a restriction to empty databases is made.
Then Theorems 6.3 and 7.3 in [2] speak about the intertranslatability of
non-monotonic theories. These theorems involve two translation functions, $\mathrm{Tr}_{BE}$ and
$\mathrm{Tr}_K$. The latter function $\mathrm{Tr}_K$ is due to Konolige [13], and it allows one to capture
the extensions of a prerequisite-free⁶ default theory $\langle D, T\rangle$ in terms of stable
expansions of the translation $\mathrm{Tr}_K(\langle D, T\rangle)$. The results by Marek and Truszczyński
[18, Section 12.5] and Gottlob [9] suggest that this translation is PFM in our
sense, implying that AEL is at least as expressive as prerequisite-free default
logic (PDL). It follows by Corollary 22 and compositionality that there is no
PFM translation from default theories into prerequisite-free ones. Interestingly,
the translation function $\mathrm{Tr}_{BE}$ is proposed to remove prerequisites from a default
theory. However, such a translation cannot be PFM by our remarks above. It
seems that $\mathrm{Tr}_{BE}$ is polynomial and modular so that $\mathrm{Tr}_{BE}$ cannot be faithful in
our sense. Indeed, the idea behind $\mathrm{Tr}_{BE}$ is to simulate the defaults of the original
default theory $\langle D, T\rangle$ without actually applying them and consequently the
extensions produced for the translation $\mathrm{Tr}_{BE}(\langle D, T\rangle)$ do not coincide with the
extensions of $\langle D, T\rangle$ up to $\mathcal{L}$ (i.e. the language of $\langle D, T\rangle$). Moreover, Theorem
6.3 in [2] does not establish a one-to-one correspondence of extensions.

⁶ A default $\alpha : \beta_1, \ldots, \beta_n/\gamma$ is called prerequisite-free, if $\alpha = \top$.

Marek et al. [15] and Engelfriet et al. [6] propose mappings to translate
a default theory $\langle D, T\rangle$ into a prerequisite-free one such that extensions are
preserved. However, these translations introduce a new default for each quasi-proof
which is a sequence of defaults from $D$. Consequently, these translations
are not polynomial in general and no contradiction arises with Corollary 22 as
discussed above. Let us also note that the latter approach [6] deals with infinitary
defaults whereas only finite defaults and default theories are considered here.

4.4 Comparison of DL and PL 

Wang, You and Yuan [28] propose a translation of a default theory $\langle D, T\rangle$ into
a priority theory $\langle R, P, T\rangle$. The idea is to break a default $\alpha : \beta_1, \ldots, \beta_n/\gamma \in D$
into inference rules $\alpha/\gamma, \neg\beta_1/\neg\beta_1, \ldots, \neg\beta_n/\neg\beta_n$ to be included in $R$. The priority
relation $P$ is chosen such that the rule $\alpha/\gamma$ has a lower priority than the rules
$\neg\beta_1/\neg\beta_1, \ldots, \neg\beta_n/\neg\beta_n$. As reported by Wang et al., this translation is faithful
only if dissimilar sets of defaults $D$ are considered, i.e. sets of defaults $D$ which
do not contain two defaults with exactly the same prerequisite $\alpha$ and consequent $\gamma$.
Wang et al. argue that this restriction is not significant, since it is possible to
differentiate the prerequisites of defaults without changing their semantics. An
unrestricted translation function introduces a new atom $p_d$ for each $d \in D$.

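Definition 24. For all default theories $\langle D, T\rangle$, the translation $\mathrm{Tr}_W(\langle D, T\rangle) =
\langle R, P, T\rangle$ where $R$ and P are such that for each $d = \alpha : \beta_1, \ldots, \beta_n/\gamma \in D$, (i) the
rules $\alpha \wedge (p_d \vee \neg p_d)/\gamma$ and $\neg\beta_1/\neg\beta_1, \ldots, \neg\beta_n/\neg\beta_n$ belong to $R$ and (ii) the rule
$\alpha \wedge (p_d \vee \neg p_d)/\gamma$ is in the relation $P$ with the rules $\neg\beta_1/\neg\beta_1, \ldots, \neg\beta_n/\neg\beta_n$.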

Theorem 25. The translation function $\mathrm{Tr}_W$ given in Definition 24 is PFM.

Proof. It is clear that $\mathrm{Tr}_W$ is polynomial and modular. To establish the faithfulness
of $\mathrm{Tr}_W$, we note that adding the tautology $p_d \vee \neg p_d$ to the prerequisite of a
default $d \in D$ does not affect the applicability of the default $d$ by any means. Let
$D'$ denote $D$ modified in this way. It is clear that the extensions of $\langle D, T\rangle$ and
$\langle D', T\rangle$ coincide up to the language $\mathcal{L}$ of $\langle D, T\rangle$. Since $D'$ is definitely dissimilar,
the translation function $\mathrm{Tr}_W$ is faithful by the one-to-one correspondence of
extensions established by Wang et al. [28, Theorem 8]. □

A translation function in the other direction can also be obtained and it
seems that one cannot do without new atoms in this case. The idea is that an
atom $a_r$ is introduced to denote that a rule $r$ of a priority theory $\langle R, P, T\rangle$ is
applied. Then the priority relation $P$ of the priority theory is easily representable
in terms of the justifications of defaults. Note that finiteness of $R$ is essential in
this translation: a single default is sufficient to represent a rule $r \in R$.

Definition 26. For all priority theories $\langle R, P, T\rangle$, the translation $\mathrm{Tr}_3(\langle R, P, T\rangle)$
is $\langle D, T\rangle$ where $D$ contains for each rule $r = \alpha/\gamma \in R$ a default
$\alpha : \neg a_{r_1}, \ldots, \neg a_{r_n} / \gamma \wedge a_r$
where $r_1, \ldots, r_n$ are all the rules of $R$ such that $\langle r, r_1\rangle \in P, \ldots, \langle r, r_n\rangle \in P$.

Theorem 27. The translation function $\mathrm{Tr}_3$ given in Definition 26 is PFM.

Proof sketch. It is obvious that $\mathrm{Tr}_3$ is polynomial and modular. Let us then
sketch how $\mathrm{Tr}_3$ is proved faithful. Consider a priority theory $\langle R, P, T\rangle$ and the
set of defaults $D$ introduced by $\mathrm{Tr}_3$. Let $\mathcal{L}$ be the language of $\langle R, P, T\rangle$ based on
a set of atoms $\mathcal{A}$. Thus the language $\mathcal{L}'$ of the resulting default theory $\langle D, T\rangle$ is
based on a set of atoms $\mathcal{A}' = \mathcal{A} \cup \{a_r \mid r \in R\}$.

(i) Consider an extension $E \subseteq \mathcal{L}$ of $\langle R, P, T\rangle$ based on a set of rules $R' \subseteq R$
satisfying the stability condition of Definition 6. Define $E' = \mathrm{Cn}(E \cup A') \subseteq
\mathcal{L}'$ where $A'$ is the set of atoms $\{a_r \mid r \in R'\}$. Then the reduct $D_{E'}$ contains
the inference rule $\alpha/\gamma \wedge a_r$ if and only if $r = \alpha/\gamma$ belongs to $\mathrm{Nb}(R, P, R')$.
Consequently, it can be shown that $E'$ is the unique extension of $\langle D, T\rangle$ with
the property $\{r \in R \mid a_r \in E'\} = R'$. (ii) Then assume that there is an extension
$E' \subseteq \mathcal{L}'$ of $\langle D, T\rangle$ and let $R' = \{r \in R \mid a_r \in E'\}$. It follows that $R'$ satisfies the
stability condition. It follows that $E = \mathrm{Cn}^{R'}(T) = E' \cap \mathcal{L}$ is an extension of
$\langle R, P, T\rangle$. The steps (i) and (ii) above establish a one-to-one correspondence of
extensions. Moreover, these extensions coincide up to the language $\mathcal{L}$. □

The results of Theorems 25 and 27 entitle us to conclude that default logic 
and priority logic are of equal expressive power. It is also worth pointing out 
that the translations presented lead to straightforward reductions between the 
decision problems of DL and PL corresponding to brave and cautious strate- 
gies. Thus our results and the complexity results on DL [8] have the following 
corollary. 

Corollary 28. The decision problems of PL corresponding to the brave and cautious
reasoning strategies are $\Sigma_2^p$-complete and $\Pi_2^p$-complete problems, respectively.

5 Conclusions 

A framework of polynomial, faithful and modular (PFM) translation functions is 
proposed in this paper to classify non-monotonic logics by their expressive power. 
If there is a PFM translation function that maps theories of one non-monotonic
logic $L_1$ into theories of another $L_2$, the sets of conclusions induced by a theory
of $L_1$ - which determine the semantics of the theory - are effectively captured by
the sets of conclusions induced by a theory of $L_2$. This is interpreted to indicate
that the non-monotonic logic $L_2$ is at least as expressive as $L_1$. A number of
translation functions are considered, and three novel translation functions $\mathrm{Tr}_1$,
$\mathrm{Tr}_2$ and $\mathrm{Tr}_3$ are proposed in the paper for classification purposes. The first two
are merely obtained by modifying existing translation functions while the last 
is completely new. It is established that these translation functions are PFM. 
Two impossibility proofs are also provided to establish strict relationships in the 
expressive power of non-monotonic logics under consideration. 

To conclude, the comparisons made in the paper give rise to a classifica- 
tion illustrated in Figure 1. Classical propositional logic CL is also included in 
the figure to complete our view. Solid arrows denote PFM translation functions 
from one non-monotonic logic to another that are considered in this work. Dotted
arrows in the figure denote translation functions obtained as compositions
of others. Such compositions are not necessarily optimal for a particular purpose.
Note, for instance, that many unnecessary atoms and sentences would be
introduced if a minimal model theory (P, F, T) were translated into a default
theory using $\mathrm{Tr}_N$, $\mathrm{Tr}_1$ and $\mathrm{Tr}_2$ that involve the intermediate representations
of (P, F, T) as an autoepistemic theory. Nevertheless, the resulting translation
function $\mathrm{Tr}_N \circ \mathrm{Tr}_1 \circ \mathrm{Tr}_2$ is still PFM by compositionality.

The non-monotonic logics under consideration are divided into three equivalence
classes by their expressive power. The strongest class contains SAEL, DL
and PL. The class below this contains AEL which is less expressive than SAEL,
DL and PL. The third and the least expressive class contains CIRC which is less
expressive than AEL. Below these three classes, there is the fourth class containing
CL which is less expressive than any of the non-monotonic logics considered. The
relationships depicted in Figure 1 refine the classification of non-monotonic logics
based on earlier results [2,7,9,10,23] on the expressive power of non-monotonic
logics. Finally, we want to emphasize that the ranking of non-monotonic logics by
their expressive power is very sensitive to the requirements imposed on translation
functions. It is demonstrated in this paper how a slight change in the notion of
faithfulness changes the relative ordering of AEL and DL to the opposite compared
to Gottlob's results [9].

Fig. 1: Non-monotonic Logics Ordered by Their Expressive Power

Future Work. The notion of modularity considered in the paper is rather 
weak, i.e. only changes in the propositional subtheory are tolerated. We expect 
that the translations considered are also modular in a stronger sense and thus 
a stronger notion of modularity can be introduced such that the classification 
depicted in Figure 1 remains intact. For instance, changes in the defaults of a 
default theory {D, T) cause only local changes in the respective sentences of the 
translation TrT((D, T)). Moreover, there are also other non-monotonic logics as 
well as variants of those considered in the paper. These logics should be analysed 
in terms of PFM translation functions in order to classify them in the hierarchy. 
For instance, PDL is interesting in this respect on the basis of Section 4.3. 
Abstract. We introduce new concepts for default reasoning in the context of
query-answering in regular default logic. For this purpose, we develop a proof-oriented
approach for deciding whether a default theory has an extension containing
a given query. The inherent problem in Reiter's default logic is that it necessitates
the inspection of all default rules for answering no matter what query. Also,
default theories are known to lack extensions occasionally. We address these two
problems by slotting in a compilation phase before the actual query-answering
phase. The examination of the entire set of default rules is then done only once 
in the compilation phase; this allows us to inspect only the ultimately necessary 
default rules during the actual query answering phase. In fact, the latter inspec- 
tion must not only account for the derivability of the query, but moreover it must 
guarantee the existence of an encompassing extension. We address this tradition- 
ally important problem by furnishing novel criteria guaranteeing the existence of 
extensions that are arguably simpler and go well beyond existing approaches. 



1 Introduction 

In many AI applications default reasoning plays an important role since many subtasks 
involve reasoning from incomplete information. This is why there is a great need for 
corresponding implementations that allow us to integrate default reasoning capabilities 
into complex AI systems. For addressing this problem, we have chosen Reiter's default
logic [11] as the point of departure. Default logic augments classical logic by default
rules that differ from standard inference rules in sanctioning inferences that rely upon
given as well as absent information. Knowledge is represented in default logic by default
theories $(D, W)$ consisting of a set of formulas $W$ and a set of default rules $D$.
A default rule $\frac{\alpha : \beta}{\gamma}$ has two types of antecedents: a prerequisite $\alpha$, which is established
if $\alpha$ is derivable, and a justification $\beta$, which is established if $\beta$ is consistent in a certain
way. If both conditions hold, the consequent $\gamma$ is concluded by default. A set of such
conclusions (sanctioned by default rules and classical logic) is called an extension of
an initial set of facts: Given a set of formulas $W$ and a set of default rules $D$, any such
extension $E$ is a deductively closed set of formulas containing $W$ such that, for any
$\frac{\alpha : \beta}{\gamma} \in D$, if $\alpha \in E$ and $\neg\beta \notin E$ then $\gamma \in E$. (A formal introduction to default logic is
given in Section 2.)

In what follows, we are interested in the basic approach to query-answering in de- 
fault logic that allows for determining whether a formula is in some extension of a given 
default theory. The inherent problem with this in regular default logic is that it necessitates
the inspection of all default rules, no matter which query is posed. This is related to
the fact that default theories may not possess extensions at all. Hence, query-answering 
has so far only been addressed indirectly: On the one hand, we find approaches that are 
primarily interested in the construction of extensions; queries are then either answer- 
able by simple membership tests or by directing the construction towards extensions 
containing the query [7,14,9,3]. On the other hand, we find approaches based on vari- 
ants of default logic that guarantee the existence of extensions and that allow for local 
proof procedures [12]. Unfortunately, these variants do not offer the same expressive- 
ness as the original approach [1]. In the general case, there is thus no way of avoiding 
exhaustive computations while preserving the expressiveness of Reiter's default logic.
Our key to this problem is given by the discovery that, for query-answering, the inspection
of the entire default theory needs to be done only once, no matter which and how many
queries are posed subsequently. This leads us to the following idea: We slot in a compilation
phase before the actual query-answering process so that afterwards queries are
answerable in a rather local fashion, insofar as the resulting procedures must examine
only those default rules necessary for proving the query.

Consider an example where birds fly, birds have wings, penguins are birds, and
penguins don't fly, along with a formalization through the general default theory

$$(D, W) = \left( \left\{ \frac{b : \neg ab_b}{f},\ \frac{b : w}{w},\ \frac{p : b}{b},\ \frac{p : \neg ab_p}{\neg f} \right\},\ \{ \neg f \to ab_b,\ f \to ab_p,\ p \} \right) \qquad (1)$$

(For short we denote the default rules in (1) by $\delta_1, \delta_2, \delta_3, \delta_4$, resp.) An analysis of these
rules should provide us with the information that the application of the first rule depends
on the blockage of the last one (and vice versa), while the second and third rule can be
applied no matter which of the other rules apply. We may thus derive $\neg f$ by ignoring
the second and the third rule, while assuring that the first one is blocked. Notably, our
initial analysis must be truly global and also extend to putatively unrelated parts of the
theory. To see this, simply add the rule $\frac{: x}{\neg x}$, destroying all previous extensions consistent
with $x$. Now, the application of each rule in (1) depends additionally on the blockage
of $\frac{: x}{\neg x}$. Thus, given $p$, we can only apply $\frac{p : \neg ab_p}{\neg f}$ to derive $\neg f$, if $\frac{b : \neg ab_b}{f}$ and $\frac{: x}{\neg x}$ are
blocked.

We note that the application of rules depends on the blockage of other rules. The 
outcome of our analysis must thus provide information on which rules may block other 
rules (or even themselves). But although we may allot somehow unlimited time to 
a compilation phase, we cannot provide unlimited space for its result. Our approach 
complies with this principle and offers as a result a so-called block graph whose size 
is quadratic in the number of defaults, although its computation may need exponential
time in the worst case.¹ The block graph represents the essential information about
blockage between default rules. In the query-answering phase this information is then
used to focus the computation of default proofs on ultimately necessary defaults only.
For instance, given the block graph of Theory (1), we may derive $f$ from $(D, W \cup \{b\})$
by means of $\delta_1$. While this involves verifying the blockage of $\delta_4$, the two
other rules may be completely ignored. This is what makes our approach different from
traditional, extension-oriented ones: Once we have compiled the blocking information,
subsequent query-answering must only consider default rules by need.

¹ The reader is reminded that query-answering in default logic has two distinct sources of
exponential complexity; it is $\Sigma_2^p$-complete [6].

In addition to the rules needed for deriving a query, like $\delta_1$ for $f$, however,
this involves considering furthermore those rules threatening the rules in the proof,
like $\delta_4$, or even those menacing its encompassing extension, like $\frac{: x}{\neg x}$. Both types
of rules are identifiable by means of the block graph. The first ones are necessarily
among the predecessors (in the block graph) of the rules used for deriving the query. 
For guaranteeing an encompassing extension, we propose a range of novel criteria that 
can be directly read off the block graph, too. In fact, these criteria are arguably simpler 
and go well beyond existing approaches addressing the traditionally important problem 
of existence of extensions. This is formally proven in the full version of this paper. 

The rest of the paper is organized as follows. In Section 2 we provide a formal 
introduction to default logic. Section 3 introduces blocking (supporting) sets and the 
block graph of a default theory, which actually is the result of the compilation phase. 
Section 5 develops a new representation of extensions, which relies on blocking sets 
and is suitable for query-answering. It also allows for defining local default proofs for 
different kinds of default theories in Section 6. Section 7 grasps the presented approach 
in its entirety and discusses its modularity. 

2 Background 

We start by completing our initial introduction to Reiter's default logic: A default rule
$\frac{\alpha : \beta}{\gamma}$ is called normal if $\beta$ is equivalent to $\gamma$; it is called semi-normal if $\beta$ implies $\gamma$.
We sometimes denote the prerequisite $\alpha$ of a default rule $\delta$ by $Pre(\delta)$, its justification $\beta$
by $Just(\delta)$ and its consequent $\gamma$ by $Cons(\delta)$.² A set of default rules $D$ and a set of
formulas $W$ form a default theory³ $\Delta = (D, W)$, that may induce one, multiple or even
no extensions [11]:

Definition 1. Let $(D, W)$ be a default theory. For any set of formulas $S$, let $\Gamma(S)$ be
the smallest set of formulas $S'$ such that

E1 $W \subseteq S'$,

E2 $Th(S') = S'$,

E3 For any $\frac{\alpha : \beta}{\gamma} \in D$, if $\alpha \in S'$ and $\neg\beta \notin S$ then $\gamma \in S'$.

A set of formulas $E$ is an extension of $(D, W)$ iff $\Gamma(E) = E$.

Any such extension represents a possible set of beliefs about the world. For example,
Default theory (1) has two extensions: $Th(W \cup \{b, w, \neg f\})$ and $Th(W \cup \{b, w, f\})$,
while theory $(D \cup \{\frac{: x}{\neg x}\}, W)$ (where $D$ and $W$ are taken as in (1)) has no extension.

For a set of formulas $S$ and a set of defaults $D$, we define the set of generating
default rules as $GD(D, S) = \{\delta \in D \mid S \vdash Pre(\delta) \text{ and } S \nvdash \neg Just(\delta)\}$. The two
last extensions are generated by $\{\delta_2, \delta_3, \delta_4\}$ and $\{\delta_1, \delta_2, \delta_3\}$, respectively. We call a

² This notation generalizes to sets of default rules in the obvious way.

³ If clear from the context, we sometimes refer with $\Delta$ to $D$ and $W$ (and vice versa) without
mention.
set of default rules $D$ grounded in a set of formulas $W$ iff there exists an enumeration
$\langle \delta_i \rangle_{i \in I}$ of $D$ such that for all $i \in I$, $W \cup Cons(\{\delta_0, \ldots, \delta_{i-1}\}) \vdash Pre(\delta_i)$. Note
that $E = Th(W \cup Cons(GD(D, E)))$ forms an extension of $(D, W)$ if $GD(D, E)$ is
grounded in $W$.
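
The notions of groundedness and generating default rules can be made concrete on Theory (1). The following is only a minimal sketch, not the procedure of this paper: it assumes a brute-force truth-table test for propositional entailment, encodes formulas as Python functions (the helper names `atom`, `neg`, `implies`, `entails`, `grounded`, `generating_defaults` and `extensions` are hypothetical), and relies on the sufficient condition noted above, namely that a grounded set $G$ with $GD(D, Th(W \cup Cons(G))) = G$ generates an extension.

```python
from itertools import combinations, product

ATOMS = ["b", "w", "p", "f", "ab_b", "ab_p"]

def atom(name):
    return lambda v: v[name]

def neg(phi):
    return lambda v: not phi(v)

def implies(phi, psi):
    return lambda v: (not phi(v)) or psi(v)

def entails(premises, goal):
    """Brute-force test of premises |- goal over all truth assignments."""
    for values in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        if all(phi(v) for phi in premises) and not goal(v):
            return False
    return True

b, w, p, f, ab_b, ab_p = (atom(a) for a in ATOMS)

# Theory (1); each default is (prerequisite, justification, consequent).
W = [implies(neg(f), ab_b), implies(f, ab_p), p]
D = {
    1: (b, neg(ab_b), f),
    2: (b, w, w),
    3: (p, b, b),
    4: (p, neg(ab_p), neg(f)),
}

def cons(rules):
    return [D[d][2] for d in rules]

def grounded(rules):
    """True iff the rules can be enumerated so that each prerequisite follows
    from W and the consequents of the rules enumerated before it."""
    applied, pending = [], set(rules)
    progress = True
    while pending and progress:
        progress = False
        for d in sorted(pending):
            if entails(W + cons(applied), D[d][0]):
                applied.append(d)
                pending.remove(d)
                progress = True
                break
    return not pending

def generating_defaults(premises):
    """GD(D, S) for S = Th(premises)."""
    return {d for d, (pre, just, _) in D.items()
            if entails(premises, pre) and not entails(premises, neg(just))}

def extensions():
    """Grounded sets G with GD(D, Th(W u Cons(G))) = G."""
    found = []
    for k in range(len(D) + 1):
        for G in combinations(sorted(D), k):
            if generating_defaults(W + cons(G)) == set(G) and grounded(G):
                found.append(set(G))
    return found

print(extensions())  # [{1, 2, 3}, {2, 3, 4}]
```

The two sets found correspond to the generating default rules $\{\delta_1, \delta_2, \delta_3\}$ and $\{\delta_2, \delta_3, \delta_4\}$ named in the text. The exhaustive enumeration of subsets is exactly the kind of global computation that the compilation approach of the following sections is meant to avoid at query time.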

For simplicity, we assume for the rest of the paper that default theories $(D, W)$
comprise finite sets only. Additionally, we assume that for each default rule $\delta$ in $D$, we
have that $W \cup Just(\delta)$ is consistent. This can be done without loss of generality, because
we can clearly eliminate all rules $\delta'$ from $D$ for which $W \cup Just(\delta')$ is inconsistent,
without altering the set of extensions.

3 Compressing Blocking Relations by Block Graphs 

Our approach is founded on the concept of blocking sets: Given a default theory $(D, W)$
and a default rule $\delta \in D$, intuitively, a blocking set for $\delta$ is a minimal set of default
rules $B \subseteq D$ such that the joint application of its rules denies the application of $\delta$.
Such a blocking set provides a candidate for disabling the putatively applicable default
rule $\delta$. For this purpose, it is actually sufficient to refute a rule's justification, ignoring
its prerequisite. This is because an existing derivation of a prerequisite can only be
counterbalanced by refuting the justification of one of its default rules. This motivates
Condition BS1 in the formal definition of blocking sets, given below.

In order to become effective, a blocking set must belong to the generating default 
rules of an extension. Thus, it must be grounded and the respective justifications must 
be consistent with the extension. Groundedness is indispensable and also easily verifi- 
able, leading to Condition BS2. Global consistency however is context-dependent since 
it makes reference to (possible) extensions encompassing the blocking set. In fact, in 
many cases, we are rather interested in showing that a critical blocking set does not 
contribute to a certain extension, either because it threatens a default proof at hand 
or even because it menaces the extension as such. Therefore, this (context-dependent) 
consistency condition must be finally addressed in the context of the respective query- 
answering process (see Section 6), so that we confine ourselves to a rather local notion 
of consistency reflected by Condition BS3. 

This leads us to the following definition of (putative) blocking sets: 

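Definition 2. Let $\Delta = (D, W)$ be a default theory. For $\delta \in D$, we define the set $B_\Delta(\delta)$
of all blocking sets for $\delta$ in $\Delta$ as follows.

If $B \subseteq D$, then $B \in B_\Delta(\delta)$ iff $B$ is a set such that

BS1 $W \cup Cons(B) \vdash \neg Just(\delta)$,

BS2 $B$ is grounded in $W$,

BS3 BS1 and BS2 hold for no $\delta'$ and no $B'$ (taken in place of $\delta$ and $B$)

where $\delta' \in B \cup \{\delta\}$ and $B' = B \setminus \{\delta''\}$ for $\delta'' \in B$.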

Note that the set of consequences, $Cons(B)$, is not required to be consistent. This is
needed, for instance, to detect groups of default rules whose joint application blocks
any other default, like $\{[\ldots]\}$.

The problem of deciding whether a set $B$ satisfies BS1 and BS2 is co-NP-complete
[13]. BS3 is an NP-complete problem. Notably, BS3 indicates that $B$ is minimal wrt
set inclusion and that $B$ does not contain both a default and some of its blocking sets.
Minimality is confirmed by confuting BS1 or BS2 for $B' = B \setminus \{\delta''\}$ and $\delta' = \delta$ for
all $\delta'' \in B$, while the denial of the latter is addressed by taking $\delta' \in B$ and
$B' = B \setminus \{\delta''\}$. The set of all blocking sets for a(ll) default rule(s) can thus be computed with
an algorithm iterating over $2^D$ with appeal to an NP-oracle.

We show in the full paper that in the worst case, a theory with $n$ rules may comprise
$O(2^n)$ blocking sets. However, the number of blocking sets is not related to the number
of extensions of a given theory. To see this, observe that Theory $(\{[\ldots] \mid i = 1..n\}, \emptyset)$
has $2^n$ extensions but only $2n$ blocking sets. That is, although we encounter
an exponential number of extensions, we have only a linear number of blocking sets.
For illustration, consider Theory (1) along with its blocking sets given in (2):

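$B_\Delta(\delta_1) = \{\{\delta_4\}\}$        $B_\Delta(\delta_2) = \emptyset$
$B_\Delta(\delta_3) = \emptyset$               $B_\Delta(\delta_4) = \{\{\delta_1, \delta_3\}\}$        (2)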

For example, $\{\delta_4\}$ is the only blocking set for $\delta_1$, because it comprises a possible
refutation of $\neg ab_b$, the justification of $\delta_1$. In general, a single default rule may have
multiple blocking sets. For example, adding […] to Theory (1) augments each set $B_\Delta(\delta_i)$
by $\{[\ldots]\}$. The addition of $\frac{: x}{\neg x}$ to (1) leaves blocking sets (2) unaffected and yields
$B_\Delta(\frac{: x}{\neg x}) = \{\{\frac{: x}{\neg x}\}\}$, reflecting self blockage.

Observe that BS3 allows us to discard blocking sets that block their own constituent
rules. For instance, $\{\delta_1, \delta_4\}$ is a putative blocking set of $\delta_2$; it is ruled out by BS3
since it contains both $\delta_1$ and one of its blocking sets, $\{\delta_4\}$.
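
Conditions BS1 to BS3 can be checked mechanically on Theory (1). The sketch below is illustrative only: it assumes a brute-force truth-table test for entailment, reads BS3 as "no $B' = B \setminus \{\delta''\}$ satisfies BS1 and BS2 for any $\delta' \in B \cup \{\delta\}$", and uses hypothetical helper names (`entails`, `grounded`, `holds_bs1_bs2`, `blocking_sets`). It simply iterates over all subsets of $D$, so it is exponential in the number of defaults.

```python
from itertools import combinations, product

ATOMS = ["b", "w", "p", "f", "ab_b", "ab_p"]

def atom(name):
    return lambda v: v[name]

def neg(phi):
    return lambda v: not phi(v)

def implies(phi, psi):
    return lambda v: (not phi(v)) or psi(v)

def entails(premises, goal):
    """Brute-force test of premises |- goal over all truth assignments."""
    for values in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        if all(phi(v) for phi in premises) and not goal(v):
            return False
    return True

b, w, p, f, ab_b, ab_p = (atom(a) for a in ATOMS)

# Theory (1); each default is (prerequisite, justification, consequent).
W = [implies(neg(f), ab_b), implies(f, ab_p), p]
D = {
    1: (b, neg(ab_b), f),
    2: (b, w, w),
    3: (p, b, b),
    4: (p, neg(ab_p), neg(f)),
}

def cons(rules):
    return [D[d][2] for d in rules]

def grounded(rules):
    """BS2: the rules can be enumerated with derivable prerequisites."""
    applied, pending = [], set(rules)
    progress = True
    while pending and progress:
        progress = False
        for d in sorted(pending):
            if entails(W + cons(applied), D[d][0]):
                applied.append(d)
                pending.remove(d)
                progress = True
                break
    return not pending

def holds_bs1_bs2(B, delta):
    """BS1: W u Cons(B) |- not Just(delta); BS2: B is grounded in W."""
    return entails(W + cons(B), neg(D[delta][1])) and grounded(B)

def blocking_sets(delta):
    result = []
    for k in range(len(D) + 1):
        for combo in combinations(sorted(D), k):
            B = frozenset(combo)
            if not holds_bs1_bs2(B, delta):
                continue
            # BS3: no B \ {d2} may satisfy BS1 and BS2 for any d1 in B u {delta}.
            if any(holds_bs1_bs2(B - {d2}, d1) for d2 in B for d1 in B | {delta}):
                continue
            result.append(set(B))
    return result

for d in sorted(D):
    print(d, blocking_sets(d))
```

Under these assumptions the computed sets agree with (2): $\{\delta_4\}$ for $\delta_1$, $\{\delta_1, \delta_3\}$ for $\delta_4$, and none for $\delta_2$ and $\delta_3$.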

Now, given the concept of blocking sets, we are ready to define the outcome of our 
compilation phase: The block graph of a default theory. 

Definition 3. Let $\Delta = (D, W)$ be a default theory. The block graph $\Gamma_\Delta = (V_\Delta, A_\Delta)$
of $\Delta$ is a directed graph with vertices $V_\Delta = D$ and arcs

$A_\Delta = \{(\delta, \delta') \mid \text{there is some } B \in B_\Delta(\delta') \text{ with } \delta \in B\}$.

So, there is an arc $(\delta', \delta)$ between default rules $\delta'$ and $\delta$ in the block graph iff $\delta'$ belongs
to some blocking set for $\delta$. For Default theory (1), we obtain the block graph given in
Figure 1; it has arcs $(\delta_4, \delta_1)$, $(\delta_1, \delta_4)$ and $(\delta_3, \delta_4)$.




Fig. 1. Block graph of Default theory (1). 
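
As a small illustration of Definition 3, the arcs of the block graph can be read off the blocking sets directly. The sketch below assumes the blocking sets of Theory (1) as listed in (2), with defaults identified by their indices; the function name `block_graph` is hypothetical.

```python
def block_graph(blocking):
    """Build (vertices, arcs) from a map: default -> list of its blocking sets.

    There is an arc (d, target) iff d occurs in some blocking set of target.
    """
    vertices = set(blocking)
    arcs = {(d, target)
            for target, sets in blocking.items()
            for B in sets
            for d in B}
    return vertices, arcs

# Blocking sets (2) of Theory (1).
blocking = {1: [{4}], 2: [], 3: [], 4: [{1, 3}]}

vertices, arcs = block_graph(blocking)
print(sorted(arcs))  # [(1, 4), (3, 4), (4, 1)]
```

The graph stores at most one arc per ordered pair of defaults, which is why its size stays quadratic even when the number of blocking sets does not.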

We observe that the size of the block graph is quadratic in the number of default 
rules, although there may be an exponential number of blocking sets. The block graph 
thus comprises the essential information from the blocking sets; this is accomplished by 
abstracting from the membership of default rules in specific blocking sets. It is actually 
unnecessary to keep any (possibly exponential number of) blocking sets beyond the 
compilation phase, because they can be effectively reconstructed from $\Gamma_\Delta$ during the
query-answering phase.

Notably, we do not even have to keep or recompute blocking sets during the compilation
phase itself, since the block graph is computable incrementally:

Lemma 4. Let $\Delta = (D, W)$ and $\Delta' = (D', W)$ be default theories with $D' \subseteq D$.

If $\delta \in D'$, then $B \in B_{\Delta'}(\delta)$ implies $B \in B_\Delta(\delta)$.

The usage of the block graph is further detailed in the following sections. 

A key issue during query-answering is to cover the safe application of default rules 
needed for proving a query. For this, we must guarantee that all (possible) blocking sets 
of such rules are blocked themselves. This leads us to the concept of supporting sets, 
which are intuitively simply blocking sets for blocking sets. First, we extend the notion 
of blocking sets to sets of rules: For a default theory $\Delta = (D, W)$ and sets $B, B' \subseteq D$,
we call $B'$ a blocking set for $B$ if there is some default rule $\delta \in B$
such that $B' \in B_\Delta(\delta)$.

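Definition 5. Let $\Delta = (D, W)$ be a default theory. For $\delta \in D$, we define the set $S_\Delta(\delta)$
of all supporting sets for $\delta$ as

$S_\Delta(\delta) = \{B'_1 \cup \ldots \cup B'_n \mid B'_i \subseteq D \text{ such that } B'_i \text{ is a blocking set for } B_i \text{ and } B_\Delta(\delta) = \{B_1, \ldots, B_n\}\}$

provided $B_\Delta(\delta) \neq \emptyset$. Otherwise, we define it as $S_\Delta(\delta) = \{\emptyset\}$.

Observe that $S_\Delta(\delta) = \emptyset$ whenever no rule in some $B_i \in B_\Delta(\delta)$ has a blocking set, because
then for $B_i$ there is no set of default rules $B'$ that is a blocking set for $B_i$, that is,
$B'_1 \cup \ldots \cup B'_n$ is undefined.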

The purpose of supporting sets is to rule out blocking sets as subsets of the generating
default rules: Once a supporting set for $\delta$ has been applied (i.e. it belongs to the
generating default rules), $\delta$ itself can be applied safely. The supporting sets in Theory (1)
are given in (3):

$S_\Delta(\delta_1) = \{\{\delta_3, \delta_1\}\}$        $S_\Delta(\delta_2) = \{\emptyset\}$
$S_\Delta(\delta_3) = \{\emptyset\}$                    $S_\Delta(\delta_4) = \{\{\delta_4\}\}$        (3)

For example, consider the supporting set for $\delta_1$: We have to find one blocking set for
each blocking set in $B_\Delta(\delta_1) = \{\{\delta_4\}\}$. In this easy case, we have to find some blocking
set for $\delta_4$, yielding $\{\delta_3, \delta_1\}$. Here, $\{\delta_3, \delta_1\}$ is the only supporting set for $\delta_1$. Similarly,
for $\delta_4$, we have to find a blocking set $B'$ for $\{\delta_3, \delta_1\}$ (see (2)). That is, we must have
$B' \in B_\Delta(\delta_3)$ or $B' \in B_\Delta(\delta_1)$. Because $B_\Delta(\delta_3)$ is empty, we get $\{\delta_4\} \in B_\Delta(\delta_1)$
as the only supporting set for $\delta_4$. The occurrence of $\delta_4$ in its supporting set is due to
the fact that there is a direct conflict between $\delta_4$ and its blocking set. This says that $\delta_4$
is safely applicable on its own. In general this need not be the case. For example, in
default theory $(\{[\ldots]\}, \emptyset)$ the last rule forms the single supporting set for the
first one.
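
The construction of supporting sets from blocking sets can be sketched as follows. This is a minimal illustration under stated assumptions: defaults are identified by their indices, the blocking sets are those listed in (2), and the function name `supporting_sets` is hypothetical. For each blocking set $B_i$ of $\delta$ it collects every blocking set of some member of $B_i$ and then forms all unions with one choice per $B_i$.

```python
from itertools import product

def supporting_sets(delta, blocking):
    """Supporting sets of delta, given a map: default -> set of blocking sets."""
    own_blocking_sets = blocking[delta]
    if not own_blocking_sets:
        return {frozenset()}  # no blocking set: the empty set supports delta
    options = []
    for B in own_blocking_sets:
        # All blocking sets for the set B, i.e. for at least one member of B.
        blockers = {Bp for member in B for Bp in blocking[member]}
        options.append(blockers)
    # One blocker per blocking set; an empty option list entry yields no union.
    return {frozenset().union(*choice) for choice in product(*options)}

# Blocking sets (2) of Theory (1).
blocking = {
    1: {frozenset({4})},
    2: set(),
    3: set(),
    4: {frozenset({1, 3})},
}

for d in sorted(blocking):
    print(d, [set(s) for s in supporting_sets(d, blocking)])
```

Under these assumptions the result agrees with (3). If some blocking set of $\delta$ cannot be blocked at all, the product is empty and the function returns the empty set, matching the observation after Definition 5.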

4 Existence of Extensions 

For query-answering, it is clearly important to know whether a default theory has an 
extension, since reasonable conclusions must reside in such an extension. In fact, determining
whether a default theory has an extension or not is in itself a major computational
problem, pertinent to default logic. In previous works, broad subclasses of default
theories always possessing extensions have been identified, among them we find normal
[11], ordered [5], ps-even [10],⁴ and strongly stratified [2] default theories. In this
section, we provide classes of default theories guaranteeing the existence of extensions
that are more general than the previous ones (see Section 7). The corresponding criteria
are thus not only a salient part of our approach to query-answering, but they moreover
represent an important contribution on their own.⁵

The basic idea is to further exploit the structure of blocking graphs for providing 
sufficient conditions on whether the underlying default theory has an extension. 

To begin with, we call a default theory $(D, W)$ non-conflicting, if it has no blocking
sets, that is, if its block graph has no arcs; otherwise we call it conflicting.
Non-conflicting default theories have unique extensions and trivially allow for
query-answering without consistency checks.⁶

Lemma 6. Every non-conflicting default theory has a single extension.

For instance, default theory […] is non-conflicting, yielding a block graph
with no arcs. The same holds for Theory (1), when eliminating either $\delta_1$ or $\delta_4$.

More interestingly, we call a default theory well-ordered, if its block graph is 
acyclic. The next result shows that well-ordered default theories have single extensions. 



Theorem 7. Every well-ordered default theory has a single extension. 

For instance, default theory $(\{[\ldots]\}, \emptyset)$ is well-ordered; its block graph contains
a single arc, indicating that the first rule may block the second one (but not vice versa).

We call a default theory even, if its block graph contains cycles with even length
only.⁷ Our first major result of this section states that even default theories always have
extensions:

Theorem 8. Every even default theory has an extension.

For instance, default theory $(\{[\ldots]\}, \emptyset)$ is even; its block graph contains two
arcs, indicating that the first rule may block the second one, and vice versa.

Evenness is also enjoyed by our initial default theory in (1), as can be easily verified
by regarding its block graph in Figure 1. Unlike this, default theory $(\{[\ldots]\}, \emptyset)$ is not
even, since its block graph contains an odd cycle, namely one of length one.
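
The three graph properties used in this section can be tested in polynomial time. The sketch below is one possible way to do so, not necessarily the test intended by the authors: it assumes the standard fact that a directed graph has an odd cycle iff it has a closed walk of odd length, and searches over (vertex, parity) states. The names `closed_walk_parities` and `classify` are hypothetical, and `classify` reports only the most specific class (a non-conflicting theory is also well-ordered and even).

```python
from collections import deque

def closed_walk_parities(vertices, arcs):
    """Parities (0 = even, 1 = odd) of the closed walks found in a digraph."""
    succ = {v: [] for v in vertices}
    for u, v in arcs:
        succ[u].append(v)
    found = set()
    for start in vertices:
        seen = set()
        queue = deque([(start, 0)])
        while queue:
            u, parity = queue.popleft()
            for v in succ[u]:
                state = (v, 1 - parity)
                if v == start:
                    found.add(1 - parity)  # returned to start with this parity
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
    return found

def classify(vertices, arcs):
    """Most specific class of a block graph among those of Section 4."""
    if not arcs:
        return "non-conflicting"
    parities = closed_walk_parities(vertices, arcs)
    if not parities:
        return "well-ordered"   # acyclic
    if 1 not in parities:
        return "even"           # cycles exist, all of even length
    return "contains an odd cycle"

# Block graph of Theory (1), see Figure 1.
print(classify([1, 2, 3, 4], [(4, 1), (1, 4), (3, 4)]))  # even
```

Each start vertex explores at most twice as many states as there are vertices, so the whole test stays polynomial in the size of the block graph.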

In fact, the elimination of putative extensions is somehow always due to odd cycles.
Not all of them, however, do lead to the destruction of extensions. For example,
the theory consisting of […] and no facts has no extension; its block
graph contains a cycle of length three, say $(\delta_c, \delta_b, \delta_a)$ (where the indexes refer to
consequents). Adding formula $c \to b$ yields actually a theory, whose only extension contains
$c$ and $b$. Although the block graph of this theory contains an odd cycle, it does now
contain additionally arc $(\delta_a, \delta_c)$, which counterbalances the self-blocking behavior of
the odd cycle.

⁴ We use ps-even for referring to the notion of evenness due to Papadimitriou and Sideri [10].

⁵ For continuity, we delay a detailed comparison with the aforementioned approaches to
Section 7.

⁶ Observe that although non-conflicting default theories ignore justifications, they are still
non-monotonic because extensions may be invalidated after augmenting a non-conflicting theory.

⁷ The length of a directed cycle in a graph is the total number of arcs occurring in the cycle.

The salient advantage of evenness and well-orderedness is that these properties are 
simple, that is, they can be tested in polynomial time, and they rely on a simple data 
structure, namely a block graph. And even more importantly, they apply to general
default theories and are syntax-independent, unlike other approaches [5,10] that apply to
semi-normal default theories only⁸ and give different results on equivalent yet
syntactically different theories (see Section 7 for more details). Even though we put forward
evenness (or well-orderedness) for testing existence of extensions, it is already worth
mentioning that this is no characteristic feature of our approach to query-answering, 
since it is actually modular in allowing for alternative checks, provided that they are 
more appropriate. 

5 Formal foundations of query-answering 

This section lays the formal foundations of our approach to query-answering. We have 
already seen in the introductory section that we need additional efforts for backing up a 
default proof. This involves two tasks, namely, protecting the constituent default rules 
of a default proof and assuring an encompassing extension. 

The first task is accomplishable by blocking and supporting sets: 

Definition 9. Let $\Delta = (D, W)$ be a default theory. A set of default rules $D' \subseteq D$ is
protected in $D$ iff for each $\delta \in D'$ we have that (i) $S \subseteq D'$ for some $S \in S_\Delta(\delta)$ and
(ii) $B \subseteq D'$ for no $B \in B_\Delta(\delta)$.

In words, a set of defaults is protected if it contains some supporting set for each con- 
stituent default and if it contains no blocking set for any of its defaults. 
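
Definition 9 translates almost literally into a subset test. The following sketch assumes the blocking sets (2) and supporting sets (3) of Theory (1), with defaults identified by their indices; the function name `is_protected` is hypothetical.

```python
def is_protected(rules, blocking, supporting):
    """Check conditions (i) and (ii) of protectedness for a set of defaults."""
    rules = frozenset(rules)
    for d in rules:
        # (i) some supporting set of d must be contained in the set.
        if not any(S <= rules for S in supporting[d]):
            return False
        # (ii) no blocking set of d may be contained in the set.
        if any(B <= rules for B in blocking[d]):
            return False
    return True

# Blocking sets (2) and supporting sets (3) of Theory (1).
blocking = {
    1: [frozenset({4})],
    2: [],
    3: [],
    4: [frozenset({1, 3})],
}
supporting = {
    1: [frozenset({1, 3})],
    2: [frozenset()],
    3: [frozenset()],
    4: [frozenset({4})],
}

print(is_protected({1, 3}, blocking, supporting))  # True
print(is_protected({1, 4}, blocking, supporting))  # False
```

The set $\{\delta_1, \delta_4\}$ fails condition (i) for $\delta_1$, whose only supporting set $\{\delta_3, \delta_1\}$ is not contained in it.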

The second task is actually more crucial. This is because there are default theories 
without extensions and, in view of local query-answering, the test for an encompassing 
extension should be accomplishable without actually computing such an extension. For 
this purpose, we can build on the criteria developed in the last section, which allow us 
to guarantee the existence of an extension of a default theory by looking at its block 
graph. 

By combining the two previous tasks, we obtain an alternative characterization of 
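extensions. For this, define for $\Delta = (D, W)$,

$\Delta \ominus D'$ as $(D \setminus (D' \cup \overline{D'}),\ W \cup Cons(D'))$        (4)

where $D' \subseteq D$ and $\overline{D'} = \{\delta \in D \mid W \cup Cons(D') \vdash \neg Just(\delta)\}$.⁹ Then, we have the
following result.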

Theorem 10. Let $\Delta = (D, W)$ be a default theory and let $E$ be a set of formulas.
Then, $E$ is an extension of $\Delta$ iff $E = Th(W \cup Cons(D') \cup E')$ for some $D' \subseteq D$ such
that (i) $D'$ is grounded in $W$, (ii) $D'$ is protected in $D$ and (iii) $\Delta \ominus D'$ has extension
$E'$.

⁸ The approaches in [5,10] do not apply to so-called weak semi-normal theories (allowing for
default rules without justifications) that have been shown to be equivalent to general default
theories in [8].

⁹ The purpose of $\overline{D'}$ is to eliminate defaults with inconsistent justifications in $\Delta \ominus D'$.
For general default theories $\Delta \ominus D'$, Condition (iii) is implementable by weak odd-,
even- or well-orderedness. In this case, Theorem 10 furnishes a characterization of
extensions in terms of blocking and supporting sets (due to Conditions (ii) and (iii)).
In case $\Delta \ominus D'$ is semi-normal, the methods in [5,10] could work just as well (except
for their syntax-dependency and the more complicated data structures needed). The
test is even trivial in case $\Delta \ominus D'$ is normal or non-conflicting. This demonstrates the
modularity of our approach as regards verifying Condition (iii), that is, existence of
extensions of $\Delta \ominus D'$.

For query-answering, it is important to notice that the set $D'$ in Theorem 10 need
not be maximal. In fact, $D'$ can be any grounded subset of the generating default rules
$GD(D, E)$ of extension $E$. This attributes to $D'$ the character of an extension-dependent
default proof: Given the set of generating default rules $GD(D, E)$ for some extension
$E$, a default proof of some formula $\varphi$ is simply a grounded set $D' \subseteq GD(D, E)$ such
that $W \cup Cons(D') \vdash \varphi$. In such a predetermined setting, we have to care neither
about the consistent application of the rules in $D'$ (this is assured by $D' \subseteq GD(D, E)$)
nor (trivially) about the existence of an encompassing extension. Both issues, addressed
by (ii) and (iii) in Theorem 10, are however of crucial importance, whenever there is no
such extension at hand.

For example, in Default theory (1), the set $D' = \{\delta_3, \delta_1\}$ may serve as a default
proof for $f$; it satisfies conditions (i) and (ii) because it is grounded and protected wrt
(1). (The latter is easily verifiable in (2) and (3).) For showing that $f$ belongs to an
existing extension, it is now sufficient to demonstrate that $\Delta \ominus D'$ has an extension,
notably without computing it. We get a non-conflicting theory

AeO' = {D\({S 3 , 6 i}U{ 64 }),WU{f,b}) = {{ 62 },{P,abpJ,b}) 

which obviously has an extension, due to an empty block graph. Hence, we have shown
that $f$ is a default conclusion of (1) without computing the corresponding extension. The
next corollary makes these ideas precise for query-answering:

Corollary 11. Let $\Delta = (D, W)$ be a default theory and let $\varphi$ be a formula. Then,
$\varphi \in E$ for some extension $E$ of $\Delta$ iff $W \cup Cons(D') \vdash \varphi$ for some $D' \subseteq D$ such that
(i) $D'$ is grounded in $W$, (ii) $D'$ is protected in $D$ and (iii) $\Delta \ominus D'$ has an extension.

Query-answering thus boils down to finding an appropriate set of default rules that 
is sufficient for deriving the query and for protecting itself against possible threats, 
provided that its “application” preserves an encompassing extension. 

6 Default Proofs for Query-Answering

After the compilation phase, we are able to decide by examining the block graph
whether a default theory is non-conflicting, well-ordered, etc., or none of these. In
analogy to these classes of default theories (and for conceptual clarity) we give an
incremental definition of a default proof. We start with default proofs for the simplest
case of non-conflicting default theories.

Definition 12 (Pure default proof$^{10}$). Let $\Delta = (D, W)$ be a default theory and $\varphi$ a
formula. A set of default rules $P_0 \subseteq D$ is a pure default proof for $\varphi$ from $\Delta$ iff

P1 $W \cup Cons(P_0) \vdash \varphi$,

P2 $P_0$ is grounded in $W$.

The problem of deciding whether a set $P_0$ is a pure default proof is co-NP-complete [13].
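As a rough illustration of Definition 12, the following Python sketch checks P1 and P2 in a deliberately simplified setting: the facts in W and all prerequisites and consequents are single atoms, so that derivability reduces to set membership. The `Default` tuple, the function names and the sample rules are hypothetical and are not the rules of Theory (1); a faithful implementation would replace the membership tests by calls to a propositional prover.

```python
from typing import Iterable, NamedTuple, Optional, Set


class Default(NamedTuple):
    name: str
    prerequisite: Optional[str]  # None stands for a trivially true prerequisite
    consequent: str


def is_grounded(defaults: Iterable[Default], facts: Set[str]) -> bool:
    """P2: the rules can be ordered so that each prerequisite follows from
    the facts plus the consequents of the rules placed before it."""
    derived = set(facts)
    pending = list(defaults)
    progress = True
    while pending and progress:
        progress = False
        for d in list(pending):
            # Assumption: derivability of an atom is plain membership.
            if d.prerequisite is None or d.prerequisite in derived:
                derived.add(d.consequent)
                pending.remove(d)
                progress = True
    return not pending


def is_pure_default_proof(defaults: Iterable[Default],
                          facts: Set[str], query: str) -> bool:
    rules = list(defaults)
    # P1: the query follows from W together with the consequents.
    derivable = query in set(facts) | {d.consequent for d in rules}
    # P2: the rule set is grounded in W.
    return derivable and is_grounded(rules, facts)


# Hypothetical toy rules: p yields b, and b yields f.
d1 = Default("d1", "p", "b")
d2 = Default("d2", "b", "f")
print(is_pure_default_proof([d2, d1], {"p"}, "f"))
print(is_pure_default_proof([d2], {"p"}, "f"))
```

In the second call the rule `d2` alone is not grounded, because its prerequisite is only derivable once `d1` has been applied.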

Actually, one may wonder whether we must stipulate an additional consistency
condition in Definition 12, viz.

P3 $W \cup Cons(P_0) \nvdash \neg Just(\delta)$ for all $\delta \in P_0$,

because this is central for applying default rules. But here, we concentrate on the very
simple class of non-conflicting default theories, which ensures that P3 holds anyway.
Otherwise there would be at least one blocking set and the default theory would be
conflicting. As a nice consequence, pure default proofs correspond to extension-dependent
default proofs but without necessitating such an extension.

For example, let A” be the default theory obtained from the one in (1) by leaving 
out ^ . Then, Z\" is non-conflicting and Pq = { } is a pure default proof for 

This proof is found without any consistency checking or any measures guaranteeing
the existence of an encompassing extension.

We have the following result for non-conflicting default theories: 

Theorem 13. Let $\Delta$ be a non-conflicting default theory and $\varphi$ a formula. Then $\varphi \in E$
for an extension $E$ of $\Delta$ iff there is a pure default proof for $\varphi$ from $\Delta$.

In case we have a conflicting yet well-ordered (or even) default theory, we have to take
supporting sets into account, because then it is necessary to protect the constituent
default rules of a default proof: Whenever we add a default rule to a partially constructed
default proof, we have to make sure that there is some supporting set for this default (if
necessary). Clearly, this must be done for the default rules in the supporting sets, too.
We thus recursively add supporting sets to default proofs: Let $\Delta = (D, W)$ be a default
theory and $\delta \in D$ a default rule. A set of default rules $C \subseteq D$ is a complete support of
$\delta$ in $\Delta$ iff $C$ is a minimal set such that for each $\delta' \in C \cup \{\delta\}$ there is some supporting
set $S' \in S_\Delta(\delta')$ with $S' \subseteq C$.

This leads us to the concept of supported default proofs: 

Definition 14 (Supported default proof). Let $\Delta = (D, W)$ be a default theory and $\varphi$
a formula. A set of default rules $P \subseteq D$ is a supported default proof for $\varphi$ from $\Delta$ iff
$P = P_0 \cup C$ with $C = \bigcup_{\delta \in P_0} C_\delta$ and

SP1 $P_0$ is a pure default proof for $\varphi$ from $\Delta$,

SP2 $C_\delta$ is a complete support of $\delta$ for each $\delta \in P_0$,

SP3 $P$ contains no blocking set for any $\delta \in P$.

In the full paper, we show that deciding whether a pure default proof is a supported
default proof can be done within $\Sigma_2^P$ of the polynomial hierarchy.

Condition SP2 ensures that there is a supporting set for each default in $P$, if
necessary. Condition SP3 prevents $P$ from blocking some of its own members; thus SP2
and SP3 together imply that $P$ is a protected set of default rules. Therefore, P3 is also
valid for supported default proofs, because otherwise there would be a blocking set in
$P$, which is a contradiction to SP3. As an important consequence, we do not need any
global consistency checks when querying even (or well-ordered) default theories.

Similar checks are done locally during the determination of supporting sets. For
obtaining these sets during the query-answering phase, we can draw on the block graph for
restricting the search space. To find out the blocking sets for a default $\delta$, we only have to
take the predecessors of $\delta$ in the block graph into account, which usually form a limited
subset of all default rules. The consistent application of $\delta$ is then guaranteed by
determining one supporting set for $\delta$ among a fixed subset of all default rules, namely the
pre-predecessors of $\delta$. We may also think of different heuristics for a more elaborate
(re)construction of blocking sets, like caching selected blocking sets retained during the
compilation phase.

For illustration, let us return to even Default theory (1). We have already seen that
$P_0 = \{\delta_4\}$ is a pure default proof for $\neg f$, that is, SP1 holds. Because
Theory (1) is conflicting, we have to warrant a complete support for each default in
$P_0$. First of all, it is important to note that our approach leaves plenty of room for
accomplishing this task, depending on whether and, if so, how many blocking sets were
maintained from the compilation phase: In case we have kept $S_\Delta(\delta_4)$, we can directly
choose one of its members, yielding $C_{\delta_4} = \{\delta_4\}$. If not, we first verify whether it is
actually necessary to generate a support for $\delta_4$, by checking whether $\delta_4$ has predecessors
in the block graph. Since this is true in our example ($\delta_4$ has predecessors $\delta_1$ and $\delta_3$),
we must (re)generate one member of $S_\Delta(\delta_4)$. This can be done by looking at the
predecessors of $\delta_1$ and $\delta_3$ in the block graph, which immediately yields $\{\delta_4\}$ as the only
supporting set, no matter whether or not $B_\Delta(\delta_1)$ and $B_\Delta(\delta_3)$ are given explicitly. In
any case, we thus get $C = C_{\delta_4} = \{\delta_4\}$, so that we have

$P = P_0 \cup C = \{\delta_4\}$

This establishes Condition SP2 for $P$. For verifying SP3 it is sufficient to observe
that no members of $P$ are connected in the block graph. Hence $P$ is a supported default
proof for $\neg f$ from Theory (1). Given the block graph, this proof is found without any
consistency checks or measures guaranteeing the existence of an encompassing
extension.

Similar arguments show that $\{\delta_3, \delta_1\}$ and $\{\delta_3, \delta_2\}$ are supported default
proofs for $f$ and $w$ from Theory (1), respectively. It is important to note that for
establishing this, we only had to warrant a support for $\delta_1$, since it is the only one among
all involved default rules having a predecessor in the block graph. As with $\delta_4$
above, however, $\delta_1$ supports itself, so that no other defaults had to be added to the
pure default proof obtained in the first case. In the second one, the default proof is even
found without any search for supporting sets (nor any consistency check). This
illustrates nicely how query-answering draws upon the information furnished by the block
graph in order to concentrate efforts on the ultimately necessary defaults only.

As a result, we get that supported default proofs furnish a sound and complete
concept for querying even default theories:

Theorem 15. Let $\Delta = (D, W)$ be an even default theory and $\varphi$ a formula. Then $\varphi \in E$
for an extension $E$ of $\Delta$ iff there is a supported default proof for $\varphi$ from $\Delta$.

The stipulation of an even default theory can be replaced by any other adequate
condition guaranteeing the existence of extensions.

In the general case, dealing with arbitrary default theories, we must additionally
guarantee that there is an encompassing extension, whose generating default rules
comprise a given (supported) default proof. As always, this should be accomplished without
actually computing such an extension. For this, we proceed as follows. After computing
a supported default proof $P$ from a theory $\Delta$, we check whether $\Delta \ominus P$ has an extension.
This is reflected by Condition DP2 of the following definition. The resulting default
proof is then called a general default proof:

Definition 16 (General default proof). Let $\Delta = (D, W)$ be a default theory and $\varphi$ a
formula. A (general) default proof for $\varphi$ from $\Delta$ is a set of default rules $P \subseteq D$ such
that

DP1 $P$ is a supported default proof for $\varphi$ from $\Delta$ and
DP2 $\Delta \ominus P$ is even (well-ordered, or non-conflicting).

Observe that DP2 can be verified by means of the block graph in polynomial time. As
above, the stipulation of evenness, well-orderedness, or even non-conflictingness can be
replaced by any other condition guaranteeing the existence of extensions. This establishes
further evidence of the modularity of our approach.

For illustration, let us render Default theory (1) a non-even, that is an odd, theory by
adding the self-circular default rule $\delta_5$. We refer to the resulting default theory as $\Delta'$.
The corresponding block graph is given in Figure 2. This makes us add $(\delta_5, \delta_5)$, $(\delta_4, \delta_5)$
and $(\delta_5, \delta_1)$ to the block graph in Figure 2; the rest of the graph remains unchanged. $\Delta'$
has a single extension: $Th(W \cup \{b, w, \neg f\})$. Now, let us reconsider in turn the previous
(supported) default proofs, in the light of this change.

We start with $P = \{\delta_4\}$. Since the set of predecessors of $\delta_4$ remains
unchanged, $P$ is still a supported default proof, thus establishing DP1. To verify DP2
for $P$, we have to construct the block graph $\Gamma_{\Delta' \ominus P}$ of $\Delta' \ominus P$ for verifying its even-
or well-orderedness. By the monotonicity property expressed in Lemma 4, $\Gamma_{\Delta' \ominus P}$ is
obtained from $\Gamma_{\Delta'}$ by simply deleting vertices and arcs. First, we delete in $\Gamma_{\Delta'}$ all
defaults in $P = \{\delta_4\}$ along with all adjacent arcs. The same is done with $\delta_1$ and $\delta_5$
because $W \cup Cons(\{\delta_4\}) \vdash \neg Just(\delta_i)$ for $i = 1, 5$ (cf. (4)). As a result, we obtain
for $\Gamma_{\Delta' \ominus P}$ a graph with vertices $\delta_2$, $\delta_3$ and no arcs, which implies that $\Delta' \ominus P$ is even
non-conflicting. This shows that $\Delta' \ominus P$ has an extension and hence that $P$ is a general
default proof for $\neg f$ from $\Delta'$.

Next, consider $P' = \{\delta_3, \delta_1\}$, the supported proof of $f$ from (1),
in the light of the additional default rule $\delta_5$. Unlike above, we now encounter an
additional predecessor of $\delta_1$, namely, $\delta_5$. Investigating the pre-predecessors of $\delta_1$ yields
two candidates for forming supporting sets: $\delta_4$ and $\delta_5$. Both of them can, however, be
ruled out immediately, since their addition to $P'$ would violate SP3. We see that their
common consequent $\neg f$ is inconsistent with those in $P'$. Also, the empty supporting set
must not be taken into account, because $\delta_1$ has predecessors (see full paper). Hence there
is no way to construct a supporting set for $\delta_1$, and so $\{\delta_3, \delta_1\}$ is no supported default
proof, as reflected by the fact that $\Delta'$ has no extension containing $f$.

Finally, let us consider $P'' = \{\delta_3, \delta_2\}$, the supported proof of $w$
from (1), in the light of the additional default rule $\delta_5$. Although the absence of
predecessors for both default rules in $P''$ does not urge us to determine (direct) supporting
sets for them, we do actually need an indirect support for guaranteeing an encompassing
extension. To see this, consider the block graph of $\Delta' \ominus P''$, given in Figure 3. We
encounter an odd cycle in $\Gamma_{\Delta' \ominus P''}$ which has an arc entering the odd cycle, namely
$(\delta_4, \delta_5)$, indicating that $\delta_5$ can be blocked by $\delta_4$. In fact, by adding $\delta_4$ as an indirect
support to $P''$, we can eliminate the odd cycle in the block graph and thus guarantee an
encompassing extension. In more detail, we must now verify that $P'' \cup \{\delta_4\}$ satisfies
DP1 and DP2. The fact that $P'' \cup \{\delta_4\}$ is a supported default proof is established as
shown above. For DP2, we must inspect the block graph of $\Delta' \ominus (P'' \cup \{\delta_4\})$, which
turns out to be the empty graph. In all, $P'' \cup \{\delta_4\}$ is thus a valid default proof for $w$
from $\Delta'$.

Fig. 3. Block graph of $\Delta' \ominus P''$.

Condition DP2 is thus verified by first eliminating the defaults involved in the 
default proof from the block graph and then checking whether the resulting block graph 
is even, acyclic, or arcless, all of which is doable in polynomial time. In order to apply 
these criteria, however, we might have to add additional rules eliminating odd cycles. 
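The polynomial-time check behind DP2 can be sketched in Python as below. The sketch assumes that "even" means that the block graph contains no cycle of odd length, that "well-ordered" means acyclic and that "non-conflicting" means arcless, as the wording above suggests. The graph encoding, the function names and the sample arcs are hypothetical (the arcs are only loosely modelled on the block graph of the modified theory), and the set of vertices to delete is passed in by the caller, since determining it requires a derivability test outside the scope of the sketch.

```python
from collections import deque
from typing import Dict, Iterable, Set

Graph = Dict[str, Set[str]]  # vertex -> set of successors


def remove_vertices(graph: Graph, removed: Iterable[str]) -> Graph:
    """Delete the given vertices together with all adjacent arcs."""
    gone = set(removed)
    return {v: {w for w in succ if w not in gone}
            for v, succ in graph.items() if v not in gone}


def is_arcless(graph: Graph) -> bool:
    return all(not succ for succ in graph.values())


def is_acyclic(graph: Graph) -> bool:
    """Kahn's algorithm: every vertex can be removed in topological order."""
    indegree = {v: 0 for v in graph}
    for succ in graph.values():
        for w in succ:
            indegree[w] += 1
    queue = deque(v for v, d in indegree.items() if d == 0)
    visited = 0
    while queue:
        v = queue.popleft()
        visited += 1
        for w in graph[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    return visited == len(graph)


def has_odd_cycle(graph: Graph) -> bool:
    """Search over (vertex, parity) pairs; a closed walk of odd length
    always contains a cycle of odd length."""
    for start in graph:
        seen = {(start, 0)}
        queue = deque(seen)
        while queue:
            v, parity = queue.popleft()
            for w in graph[v]:
                state = (w, 1 - parity)
                if state == (start, 1):
                    return True
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
    return False


# Hypothetical block graph with a self-loop on d5.
graph = {"d1": {"d4"}, "d2": set(), "d3": {"d4"},
         "d4": {"d1", "d5"}, "d5": {"d5", "d1"}}
print(has_odd_cycle(graph))
print(is_arcless(remove_vertices(graph, {"d4", "d1", "d5"})))
```

Each of the three tests runs in time polynomial in the size of the graph; the parity search visits at most twice as many states as there are vertices for each start vertex.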

We have the following result in the general case. 

Theorem 17. Let $\Delta = (D, W)$ be a (general) default theory and $\varphi$ a formula. Then
$\varphi \in E$ for an extension $E$ of $\Delta$ iff there is a (general) default proof for $\varphi$ from $\Delta$.

We have seen that the formation of default proofs benefits considerably from the usage 
of block graphs. As a major consequence, we may restrict our attention to ultimately 
necessary default rules; thus avoiding the construction of entire extensions. In fact, we 
have seen that irrelevant default rules, like for answering queries/or are always 

ignored, since they are not related to the constituent rules of the respective proofs in 
the block graph. Contrariwise, we may ignore default rules ^ and when 

applying unless they must be called in for blocking extension-menacing rules, 
like ^ . In either case, both the non-interaction of ^ and ^ with and the 
interaction between ^ and are read off the block graph. 







Thomas Linke and Torsten Schaub 



As a result, all previous default proofs consist of proper subsets of the generating
defaults. Of course, in the worst case there are default theories and queries for which
we have to consider all generating defaults. But this is then arguably a matter of fact
and unavoidable.

7 Discussion 

We have introduced new concepts on default logic leading to alternative characteriza- 
tions of default proofs and their encompassing extensions. The central role among these 
concepts is played by that of a block graph. First, it allows us to determine the existence 
of extensions in polynomial time by means of simple graph operations. Second, it tells 
us exactly which default rules must be considered for applying a default rule. The ef- 
fort put into the computation of the block graph thus pays off whenever it allows us 
to validate default proofs by appeal to rather small subsets of the overall set of default 
rules. 

We have presented our concepts in the context of query-answering. The resulting
characterization of default proofs for queries is (to the best of our knowledge) the first
one that does not appeal to the computation of entire extensions. For this, we have
divided query-answering in Reiter's default logic into an off-line and an on-line process: We
start with a compilation phase which results in the block graph $\Gamma_\Delta$ of a default theory $\Delta$.
$\Gamma_\Delta$ contains the essential information about the blocking relations between the default
rules in $\Delta$ and is quadratic in the number of defaults. Unlike [7,14], our approach thus
does not suffer from exponential space complexity. The subsequent query-answering
phase aims at finding a default proof $P$ such that $\Delta \ominus P$ possesses an extension. This
gives us a default proof which contains only the ultimately necessary defaults. To be
more precise, a default rule belongs to $P$ only (i) if it contributes to the derivation of
the query or if it is needed (iia) for supporting a constituent rule of the proof or (iib)
for supporting an encompassing extension. While the former is fixed by the standard
inferential relation, the two latter are determined by the block graph. For delineating
such a proof, we can draw on $\Gamma_\Delta$ for (re)computing the blocking and supporting sets
of its constituent rules. Blocking sets are found among the direct predecessors of a rule,
while the search for its supporting sets can be restricted to its pre-predecessors. This is
clearly the more efficient the sparser the block graph. A default rule neither belonging
to the actual proof nor being related to it via a path in the block graph thus never needs
to be taken into account during query-answering (provided that an encompassing
extension exists). This is what makes our approach different from extension-oriented ones
[7,14,9,3].

An important underlying problem is that of guaranteeing extensions of $\Delta$ or its
derivative $\Delta \ominus P$. We addressed this problem by providing a range of criteria, each of
which can be read off the block graph in polynomial time. However, our investment
into the construction of a block graph does not only provide us with fast checks for
encompassing extensions, which is also a great benefit in view of repetitive queries,
but it furnishes moreover criteria that go beyond existing ones: We show in the full
paper (i) that every default theory satisfying the criterion proposed by Papadimitriou
and Sideri in [10] (which is itself a generalization of the one in [4]) is an even theory
but not vice versa and (ii) that default theories satisfying the criterion proposed by
Cholewiński in [2] are orthogonal to even theories. Moreover, our criteria are fully
syntax-independent, which is not the case with any of the other approaches. Also, we
can treat general default theories, while [4,10] apply to semi-normal default theories
only and [2] imposes strong conditions on the occurrence of propositional variables.
For fairness, we draw the reader's attention to the fact that all initial analyses undertaken
by the aforementioned approaches are polynomial, unlike the computation of the block graph.

Finally, it is important to note that our query-answering approach is modular in 
testing the existence of extensions. Depending on the different kinds of default theories 
(general, semi-normal or normal), we are able to choose different criteria for this test. 
Abstract. Successor state axioms are an optimal solution to the famous
Frame Problem in reasoning about actions — but only as far as its repre- 
sentational aspect is concerned. We show how by gradually applying the 
principle of reification to these axioms, one can achieve gradual improve- 
ment regarding the inferential aspect without losing the representational 
merits. The resulting concept of state update axioms constitutes a novel 
version of what is known as the Fluent Calculus. We illustrate that under 
the provision that actions have no so-called open effects, any Situation 
Calculus specification can be transformed into an essentially equivalent 
Fluent Calculus specification, in which at the same time the represen- 
tational and the inferential aspect of the Frame Problem are addressed. 
This alternative access to the Fluent Calculus both clarifies its role in 
relation to the most popular axiomatization paradigm and should help 
to enhance its acceptance. 



1 Introduction 

For a long time, the Fluent Calculus, introduced in [7] and so christened in [3], 
has been viewed exclusively as a close relative of approaches to the Frame Prob- 
lem [12] which appeal to non-classical logics, namely, linearized versions of, re- 
spectively, the connection method [1,2] and Gentzen’s sequent calculus [11]. The 
affinity of the Fluent Calculus and these two formalisms has been emphasized 
by several formal comparison results. In [5], for example, the three approaches 
have been proved to deliver equivalent solutions to a resource-sensitive variant 
of Strips planning [4] . 

Yet the Fluent Calculus possesses a feature by which it stands out against the 
two other frameworks: It stays entirely within classical logic. In this setting the 
Fluent Calculus constitutes a successful attempt to address the Frame Problem 
as regards both the representational aspect (since no effect axiom or any other 
axiom needs to mention non-effects) and, at the same time, the inferential aspect 
(since carrying over persistent fluents from one situation to the next does not 
require separate deduction steps for each) . Contrary to popular opinion, all this 
is achieved without relying on complete knowledge of the initial or any other 
situation. Nonetheless the Fluent Calculus has not yet received as much attention
in the scientific community as, say, the Situation Calculus. One reason might be 
that, due to its heritage, the relation to the mainstream calculi, and in particular 
to the Situation Calculus, has not yet been convincingly elaborated. 

The purpose of this paper is to present an alternative approach to the Fluent 
Calculus, where we start off from the Situation Calculus in the version where 
successor state axioms are used as means to solve the representational aspect of 
the Frame Problem [13]. We illustrate how the Fluent Calculus can be viewed as 
the result of gradually improving this approach in view of the inferential aspect 
but without losing its representational merits. The key is to gradually apply the 
principle of reification, which means to use terms instead of atoms as the formal 
denotation of statements. Along the path leading from successor state axioms 
to the Fluent Calculus lies an intermediate approach, namely, the alternative 
formulation of successor state axioms described by [9], in which atomic fluent 
formulas are reified. This alternative design inherits the representational advan- 
tages and additionally addresses the inferential Frame Problem. Yet it does so 
only under the severe restriction that complete knowledge of the values of the 
relevant fluents in the initial situation is available. The Fluent Calculus can then 
be viewed as a further improvement in that it overcomes this restriction by carry- 
ing farther the principle of reification to conjunctions of fluents. In the following 
section we illustrate by means of examples how successor state axioms can thus 
be reified to what we call state update axioms. In Section 3 we then present a 
fully mechanical method to derive state update axioms from effect specifications
with arbitrary first-order conditions. One restriction turns out necessary for this
method to be correct, namely, that actions do not have so-called open effects.^ 
In Section 4, we will briefly show how to design state update axioms for actions 
with such effects. 

Viewed in the way we pursue in this paper, the Fluent Calculus presents itself 
as the result of a successful attempt to cope with the inferential Frame Problem, 
starting off from successor state axioms as a solution to the representational 
aspect. Our hope is that this alternative access clarifies the role of this
axiomatization paradigm in relation to the most popular approach and helps to enhance
its acceptance. Following the new motivation it should become clearer that the
Fluent Calculus provides an expressive axiomatization technique, in the setting 
of classical logic, which altogether avoids non-effect axioms and at the same time 
successfully copes with the inferential aspect of the Frame Problem. 



^ This concept is best explained by an example. This axiom specifies an open effect:
$\forall x, y, s.\ Bomb(x) \wedge Nearby(x, y, s) \supset Destroyed(y, Do(Explodes(x), s))$. Even after
instantiating the action expression $Explodes(x)$ and the situation term $s$, the effect
literal still carries a variable, $y$, so that the action may have infinitely many effects.
2 From Situation Calculus to Fluent Calculus

2.1 From Successor State Axioms (I) ...

Reasoning about actions is inherently concerned with change: Properties expire, 
objects come into being and cease to exist, true statements about the state 
of affairs at some time may be entirely wrong at another time. The first and 
fundamental challenge of formalizing reasoning about actions is therefore to 
account for the fact that most properties in the real world possess just a limited 
period of validity. This unstable nature of properties which vary in the course of
time has led to calling them "fluents." In order to account for fluents changing
their truth values in the course of time as consequences of actions, the Situation
Calculus paradigm [12] is to attach a situation argument to each fluent, thus
limiting its range of validity to a specific situation. The performance of an action
then brings about a new situation in which certain fluents may no longer hold.

As an example which will be used throughout the paper, we will formalize 
the reasoning that led to the resolution of the following little mystery: 

A reliable witness reported that the murderer poured some milk into a cup of 
tea before offering it to his aunt. The old lady took a drink or two and then 
she suddenly fell into the armchair and died an instant later, by poisoning as 
has been diagnosed afterwards. According to the witness, the nephew had no 
opportunity to poison the tea beforehand. This proves that it was the milk 
which was poisoned and by which the victim was murdered. 

The conclusion in this story is obviously based on some general commonsense 
knowledge of poisoned substances and the way they may affect people’s health. 
To begin with, let us formalize by means of the Situation Calculus the relevant
piece of knowledge that mixing a poisoned substance into another one
causes the latter to be poisoned as well. To this end, we use the binary predicate
$Poisoned(x, s)$ representing the fact that $x$ is poisoned in situation $s$, the
action term $Mix(p, x, y)$ denoting the action carried out by agent $p$ of mixing $x$
into $y$, and the binary function $Do(a, s)$, which denotes the situation reached
by performing action $a$ in situation $s$.^ With this signature and its
semantics the following axiom formalizes the fact that if $x$ is poisoned in
situation $s$ then $y$, too, is poisoned in the situation that obtains when someone
mixes $x$ into $y$:

$Poisoned(x, s) \supset Poisoned(y, Do(Mix(p, x, y), s))$ (1)

The second piece of commonsense knowledge relevant to our example concerns
the effect of drinking poisoned liquids. Let $Alive(x, s)$ represent the property
of $x$ being alive in situation $s$, and let the action term $Drink(p, x)$ denote that
$p$ drinks $x$. Then the following axiom encodes the fact that if $x$ is poisoned
then person $p$ ceases to be among the living after she has drunk $x$:

$Alive(p, s) \wedge Poisoned(x, s) \supset \neg Alive(p, Do(Drink(p, x), s))$ (2)

^ A word on the notation: Predicate and function symbols, including constants, start
with a capital letter whereas variables are in lower case, sometimes with sub- or
superscripts. Free variables in formulas are assumed universally quantified.

These two effect axioms, however, do not suffice to solve the mystery due to
the Frame Problem, which was uncovered as early as [12]. To see why,
let $S_0$ be a constant by which we denote the initial situation, and consider the
assertion

$\neg Poisoned(Tea, S_0) \wedge Alive(Nephew, S_0) \wedge Alive(Aunt, S_0)$ (3)

Even with $Poisoned(Milk, S_0)$ added, $\neg Alive(Aunt, S_2)$ does not yet follow
(where $S_2 = Do(Drink(Aunt, Tea), Do(Mix(Nephew, Milk, Tea), S_0))$), because
$Alive(Aunt, Do(Mix(Nephew, Milk, Tea), S_0))$ is needed for axiom (2) to apply
but cannot be concluded. In order to obtain this and other intuitively expected
conclusions, a number of non-effect axioms (or "frame axioms") need to be
supplied, like the following, which says that people survive the mixing of substances:

$Alive(x, s) \supset Alive(x, Do(Mix(p, y, z), s))$ (4)

Now, the Frame Problem is concerned with the problems that arise from the 
apparent need for non-effect axioms like (4). Actually there are two aspects of 
this famous problem: The representational Frame Problem is concerned with 
the proliferation of all the many frame axioms. The inferential Frame Problem 
describes the computational difficulties raised by the presence of many non-effect 
axioms when it comes to making inferences on the basis of an axiomatization: 
To derive the consequences of a sequence of actions it is necessary to carry, one- 
by-one and almost all the time using non-effect axioms, each property through 
each intermediate situation. 

With regard to the representational aspect of the Frame Problem, successor 
state axioms [13] provide a solution which is optimal in a certain sense, namely, 
in that it requires no extra frame axioms at all. The key idea is to combine, in a 
clear elaborated fashion, several effect axioms into a single one. The result, more 
complex than simple effect axioms like (1) and (2) but still mentioning solely 
effects, is designed in such a clever way that it implicitly contains sufficient
information also about non-changes of fluents.

The procedure by which these axioms are set up is the following. Suppose
$F(\vec{x})$ is among the fluents one is interested in. On the assumption that a fixed,
finite set of actions is considered relevant, it should be possible to specify with a
single formula $\gamma_F^+(\vec{x}, a, s)$ all circumstances by which $F(\vec{x})$ would be caused
to become true. That is to say, $\gamma_F^+(\vec{x}, a, s)$ describes all actions $a$ and
conditions relative to situation $s$ so that $F(\vec{x})$ is a positive effect of performing $a$
in $s$. For example, among the actions we considered above there is one, and
only one, by which the fluent $Poisoned(x)$ is made true, namely, mixing some
poisonous $y$ into $x$. Hence an adequate definition of $\gamma_{Poisoned}^+(x, a, s)$ is the
formula $\exists p, y\,[a = Mix(p, y, x) \wedge Poisoned(y, s)]$.
A dual formula, $\gamma_F^-(\vec{x}, a, s)$, defines the circumstances by which fluent $F(\vec{x})$
is caused to become false. In our example we consider no way to 'decontaminate'
a substance, which is why $\gamma_{Poisoned}^-(x, a, s)$ should be equated with a logical
contradiction. For our second fluent, $Alive(x)$, the situation is just the other way
round: While $\gamma_{Alive}^+(x, a, s)$ is false for any instance, the appropriate definition
of $\gamma_{Alive}^-(x, a, s)$ is $\exists y\,[a = Drink(x, y) \wedge Alive(x, s) \wedge Poisoned(y, s)]$.

On the basis of suitable definitions for both $\gamma_F^+$ and $\gamma_F^-$, a complete account
can be given of how the truth value of fluent $F$ in a new situation depends on
the old one, namely,

$F(\vec{x}, Do(a, s)) \equiv \gamma_F^+(\vec{x}, a, s) \vee [\, F(\vec{x}, s) \wedge \neg\gamma_F^-(\vec{x}, a, s) \,]$ (5)

This is the general form of successor state axioms.^ It says that the fluent $F$
holds in a new situation if, and only if, it is either a positive effect of the action
being performed, or it was already true and the circumstances were not such that
the fluent had to become false. Notice that both $\gamma_F^+$ and $\gamma_F^-$ talk exclusively
about effects (positive and negative), not at all about non-effects. Nonetheless,
by virtue of being bi-conditional, a successor state axiom implicitly contains all
the information needed to entail any non-change of the fluent in question. For
whenever neither $\gamma_F^+(\vec{x}, a, s)$ nor $\gamma_F^-(\vec{x}, a, s)$ is true, then (5) rewrites to the
simple equivalence $F(\vec{x}, Do(a, s)) \equiv F(\vec{x}, s)$.

The two successor state axioms for our example domain, given the respective
formulas $\gamma$ from above, are

$Poisoned(x, Do(a, s)) \equiv \exists p, y\,[a = Mix(p, y, x) \wedge Poisoned(y, s)] \vee Poisoned(x, s)$

and

$Alive(x, Do(a, s)) \equiv Alive(x, s) \wedge \neg\exists y\,[a = Drink(x, y) \wedge Alive(x, s) \wedge Poisoned(y, s)]$

The latter, for instance, suffices to conclude that $Alive(Aunt, S_0)$ is not affected
by the action $Mix(Nephew, Milk, Tea)$, assuming "unique names" for actions,
i.e., $Mix(p', x', y') \neq Drink(x, y)$. Thus we can spare the frame axiom (4).
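The two successor state axioms can be evaluated mechanically once the initial situation is fixed. The following Python sketch does so under the additional assumption of complete knowledge about $S_0$ (milk poisoned, tea not, nephew and aunt alive), which the axioms themselves do not require; the class and function names are hypothetical. A situation is represented as the tuple of actions performed since $S_0$, oldest first, and each function mirrors the right-hand side of the corresponding axiom.

```python
from typing import NamedTuple, Tuple, Union


class Mix(NamedTuple):
    agent: str
    source: str  # the substance that is mixed ...
    target: str  # ... into this substance


class Drink(NamedTuple):
    agent: str
    liquid: str


Action = Union[Mix, Drink]
Situation = Tuple[Action, ...]  # actions performed since S0, oldest first

# Assumed complete description of the initial situation S0.
POISONED_S0 = frozenset({"Milk"})
ALIVE_S0 = frozenset({"Nephew", "Aunt"})


def poisoned(x: str, s: Situation) -> bool:
    if not s:
        return x in POISONED_S0
    earlier, a = s[:-1], s[-1]
    # Positive effect: some poisoned substance was mixed into x.
    gamma_plus = (isinstance(a, Mix) and a.target == x
                  and poisoned(a.source, earlier))
    return gamma_plus or poisoned(x, earlier)


def alive(x: str, s: Situation) -> bool:
    if not s:
        return x in ALIVE_S0
    earlier, a = s[:-1], s[-1]
    # Negative effect: x was alive and drank a poisoned liquid.
    gamma_minus = (isinstance(a, Drink) and a.agent == x
                   and alive(x, earlier) and poisoned(a.liquid, earlier))
    return alive(x, earlier) and not gamma_minus


s1: Situation = (Mix("Nephew", "Milk", "Tea"),)
s2: Situation = s1 + (Drink("Aunt", "Tea"),)
print(alive("Aunt", s1), poisoned("Tea", s1), alive("Aunt", s2))
```

No frame axiom is needed to see that the aunt survives the mixing. The recursion nevertheless carries each queried fluent through every intermediate situation, which is the inferential aspect of the Frame Problem.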

By specifying the effects of actions in the form of successor state axioms it is
possible to avoid frame axioms altogether. These axioms thus provide us with
a solution to the Frame Problem that is optimal in a certain sense, as far as the
representational aspect is concerned.



2.2 . . . via Successor State Axioms (II) . . . 

While successor state axioms are a good way to overcome the representational
Frame Problem since no frame axioms at all are required, the inferential aspect
is fully present. In order to derive which fluents hold and which do not after a
sequence of actions, it is still necessary to carry, one-by-one, each fluent through
each intermediate situation by separate instances of successor state axioms. In
this respect nothing seems gained by incorporating knowledge of non-effects in
complex effect axioms instead of using explicit frame axioms.

^3 For the sake of clarity we ignore the concept of action precondition in this paper, as
it is irrelevant for our discussion (see Section 4).

However, it has been shown in [9] that by formulating successor state axioms 
in a way that is somehow dual to the scheme (5), the inferential aspect can be 
addressed at least to a certain extent. Central to this alternative is the
representation technique of reification. It means that properties like Poisoned(x) are
formally modeled as terms, in other words as objects, in logical axiomatizations.
This allows for a more flexible handling of these properties within first-order
logic. Let, to this end, Holds(f, s) be a binary predicate representing the fact
that the fluent f, now formally a term but still meaning a proposition, holds in
situation s.

The key to the alternative form of successor state axioms is to devise one
for each action, and not for each fluent, which gives a complete account of the
positive and negative effects of that action. Suppose $A(\vec{x})$ is an action, then
it should be possible to specify with a single formula $\delta^+_A(\vec{x}, f, s)$ the
necessary and sufficient conditions on f and s so that f is a positive effect of
performing $A(\vec{x})$ in s. In our running example, the appropriate definition
of $\delta^+_{Mix}(p, x, y, f, s)$, say, is $[\, f = Poisoned(y) \,] \wedge Holds(Poisoned(x), s)$, while
$\delta^+_{Drink}(p, x, f, s)$ should be equated with a logical contradiction since Drink(p, x)
has no relevant positive effect. A dual formula, $\delta^-_A(\vec{x}, f, s)$, defines the necessary
and sufficient conditions on f and s so that f is a negative effect of performing
$A(\vec{x})$ in s. For instance, $\delta^-_{Mix}(p, x, y, f, s)$ should be false in any case, while
$\delta^-_{Drink}(p, x, f, s)$ is suitably described by $[\, f = Alive(p) \,] \wedge Holds(Alive(p), s) \wedge Holds(Poisoned(x), s)$.

On the basis of $\delta^+_A$ and $\delta^-_A$, a complete account can be given of which fluents
hold in situations reached by performing $A(\vec{x})$, namely,

    $Holds(f, Do(A(\vec{x}), s)) \equiv \delta^+_A(\vec{x}, f, s) \vee [\, Holds(f, s) \wedge \neg\delta^-_A(\vec{x}, f, s) \,]$   (8)

That is to say, the fluents which hold after performing the action $A(\vec{x})$ are
exactly those which are among the positive effects or which held before and are
not among the negative effects. The reader may contrast this scheme with (5)
and in particular observe the reversed roles of fluents and actions.^4

Given the formulas $\delta^+_{Mix}(p, x, y, f, s)$, $\delta^-_{Mix}(p, x, y, f, s)$, $\delta^+_{Drink}(p, x, f, s)$, and
$\delta^-_{Drink}(p, x, f, s)$, respectively, from above, we thus obtain these two successor
state axioms of type (II):

    $Holds(f, Do(Mix(p, x, y), s)) \equiv [\, f = Poisoned(y) \wedge Holds(Poisoned(x), s) \,] \vee Holds(f, s)$   (9)

and

    $Holds(f, Do(Drink(p, x), s)) \equiv Holds(f, s) \wedge \neg [\, f = Alive(p) \wedge Holds(Alive(p), s) \wedge Holds(Poisoned(x), s) \,]$   (10)

^4 Much as [13] is rooted in the axiomatization technique of [6], the foundations for the
alternative form of successor state axioms were laid in [10].

Notice that as before non-effects are not explicitly mentioned and no additional
frame axioms are required, so the representational aspect of the Frame Problem
is addressed with the alternative notion of successor state axioms just as well.
The inferential advantage of the alternative design shows if we represent the
collection of fluents that are true in a situation s by equating the atomic formula
Holds(f, s) with the conditions on f to hold in s. The following formula, for
instance, constitutes a suitable description of the initial situation in our example:

    $Holds(f, S_0) \equiv f = Alive(Nephew) \vee f = Alive(Aunt) \vee f = Poisoned(Milk)$   (11)

The crucial feature of this formula is that the situation argument, $S_0$, occurs only
once. With this representational trick it becomes possible to obtain a complete
description of a successor situation in one go, that is, by a single application
of a successor state axiom. To see why, consider the axiom which specifies the
effects of mixing, (9). If we substitute p, x, and y by Nephew, Milk, and Tea,
respectively, and s by $S_0$, then we can replace the subformula $Holds(f, S_0)$ of
the resulting instance by the equivalent disjunction as given in axiom (11). So
doing yields the formula

    $Holds(f, Do(Mix(Nephew, Milk, Tea), S_0)) \equiv [\, f = Poisoned(Tea) \wedge Holds(Poisoned(Milk), S_0) \,] \vee f = Alive(Nephew) \vee f = Alive(Aunt) \vee f = Poisoned(Milk)$

which all at once provides a complete description of the successor situation.
Given suitable axioms for equality, the above can be simplified, with the aid
of (11), to

    $Holds(f, Do(Mix(Nephew, Milk, Tea), S_0)) \equiv f = Poisoned(Tea) \vee f = Alive(Nephew) \vee f = Alive(Aunt) \vee f = Poisoned(Milk)$

The reader may verify that we can likewise infer the result of Drink(Aunt, Tea)
in the new situation by applying the appropriate instance of successor state
axiom (10), which, after simplification, yields

    $Holds(f, Do(Drink(Aunt, Tea), Do(Mix(Nephew, Milk, Tea), S_0))) \equiv f = Poisoned(Tea) \vee f = Alive(Nephew) \vee f = Poisoned(Milk)$
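
The one-step character of this inference can be sketched in Python by treating a complete situation description such as (11) as a finite set of fluents. The function names and the tuple encoding of fluents below are hypothetical, chosen only for this illustration; the two functions mirror axioms (9) and (10).

```python
def do_mix(p, x, y, state):
    """Axiom (9): f holds afterwards iff [f = Poisoned(y) and Poisoned(x) held] or f held."""
    positive = {("Poisoned", y)} if ("Poisoned", x) in state else set()
    return frozenset(state | positive)


def do_drink(p, x, state):
    """Axiom (10): f holds afterwards iff f held and f is not the terminated Alive(p)."""
    dies = ("Alive", p) in state and ("Poisoned", x) in state
    negative = {("Alive", p)} if dies else set()
    return frozenset(state - negative)


# Complete description of the initial situation, as in formula (11).
S0 = frozenset({("Alive", "Nephew"), ("Alive", "Aunt"), ("Poisoned", "Milk")})
S1 = do_mix("Nephew", "Milk", "Tea", S0)
S2 = do_drink("Aunt", "Tea", S1)
```

Each action maps one whole description to the next in a single step, so unaffected fluents are never handled individually. The sketch only works because `S0` is complete, which is exactly the limitation discussed next.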

At first glance it seems that the alternative design of successor state axioms
provides an overall satisfactory solution to both aspects of the Frame Problem.
No frame axioms at all are needed, and one instance of a single successor state
axiom suffices to carry over to the next situation all unchanged fluents. However,
the proposed method of inference relies on the very strong assumption that we
can supply a complete account of what does and what does not hold in the
initial situation. Formula (11) provides such a complete specification, because it
says that any fluent which does not occur to the right of the equivalence symbol
is necessarily false in $S_0$. Unfortunately it is impossible to formulate partial
knowledge of the initial state of affairs in a similarly advantageous fashion. Of
course one can start with an incomplete specification like, for instance,

    $Holds(f, S_0) \subset [\, f = Alive(Nephew) \vee f = Alive(Aunt) \,] \wedge f \neq Poisoned(Tea)$

which mirrors the incomplete description we used earlier (cf. formula (3)). But
then the elegant inference step from above, where we have simply replaced a
subformula by an equivalent, is no longer feasible. In this case one is in no way
better off with the alternative notion of successor state axioms; again separate
instances need to be applied, one for each fluent, in order to deduce what holds
in a successor situation.



2.3 ... to State Update Axioms 

So far we have used reification to denote single properties by terms. The ‘meta’-
predicate Holds has been introduced which relates a reified fluent to a situation
term, thus indicating whether the corresponding property is true in the
associated situation. When formalizing collected information about a particular
situation S as to which fluents are known to hold in it, the various corresponding
atoms $Holds(f_i, S)$ are combined using the standard logical connectives.
We have seen how the inferential aspect of the Frame Problem is addressed if
this is carried out in a certain way, namely, by equating Holds(f, s) with some
suitable formula $\Psi$. The effects of an action a can then be specified in terms
of how $\Psi$ is modified to some formula $\Psi'$ such that $Holds(f, Do(a, s)) \equiv \Psi'$.

We have also seen, however, that this representation technique is still not
sufficiently flexible in that it is impossible to construct a first-order formula $\Psi$
so that $Holds(f, S_0) \equiv \Psi$ provides a correct incomplete specification of $S_0$.
Yet it is possible to circumvent this drawback by carrying further the principle
of reification, to the extent that not only single fluents but also their
conjunctions are formally treated as terms. Required to this end is a binary
function which to a certain extent reifies the logical conjunction. This function
shall be denoted by the symbol “$\circ$” and written in infix notation, so that,
for instance, the term $Alive(Nephew) \circ Poisoned(Milk)$ is the reified version of
$Alive(Nephew) \wedge Poisoned(Milk)$. The use of the function “$\circ$” is the
characteristic feature of axiomatizations which follow the paradigm of Fluent Calculus.

The union of all relevant fluents that hold in a situation is called the state
(of the world) in that situation. Recall that a situation is characterized by the
sequence of actions that led to it. While the world possibly exhibits the very same
state in different situations,^5 the world is in a unique state in each situation. A
function denoted by State(s) shall relate situations s to the corresponding
states, which are reified collections of fluents.

Modeling entire states as terms allows the use of variables to express mere
partial information about a situation. The following, for instance, is a correct
incomplete account of the initial situation $S_0$ in our mystery story (cf. (3)):

    $\exists z\, [\, State(S_0) = Alive(Nephew) \circ Alive(Aunt) \circ z \ \wedge\ \forall z'.\ z \neq Poisoned(Tea) \circ z' \,]$   (12)

That is to say, of the initial state it is known that both Alive(Nephew) and
Alive(Aunt) are true and that possibly some other facts z hold, too, with the
restriction that z must not include Poisoned(Tea), of which we know it is false.

The binary function “$\circ$” needs to inherit from the logical conjunction an
important property. Namely, the order in which conjuncts are given is irrelevant.
Formally, order ignorance is ensured by stipulating associativity and
commutativity, that is, $\forall x, y, z.\ (x \circ y) \circ z = x \circ (y \circ z)$ and $\forall x, y.\ x \circ y = y \circ x$. It is
convenient to also reify the empty conjunction, a logical tautology, by a constant
usually denoted $\emptyset$ and which satisfies $\forall x.\ x \circ \emptyset = x$. The three equational axioms,
jointly abbreviated AC1, in conjunction with the standard axioms of equality
entail the equivalence of two state terms whenever they are built up from an
identical collection of reified fluents.^6 In addition, denials of equalities, such as
in the second part of formula (12), need to be derivable. This requires an
extension of the standard assumption of “unique names” for fluents to uniqueness of
states, denoted by EUNA (see, e.g., [8,14]).
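
One way to picture state terms modulo AC1 is as finite multisets of fluents. The following Python sketch uses `collections.Counter` for this purpose; the helper names `fluent`, `compose`, and `is_part_of` are hypothetical and only illustrate the equational behaviour, they are not taken from the paper.

```python
from collections import Counter

EMPTY = Counter()  # plays the role of the unit element of "o"


def fluent(name, *args):
    """A state term consisting of a single reified fluent."""
    return Counter({(name, *args): 1})


def compose(*terms):
    """The function "o": multiset union, hence associative and commutative."""
    total = Counter()
    for term in terms:
        total += term
    return total


def is_part_of(part, state):
    """True iff state = part o z for some z (a sub-multiset test)."""
    return all(state[key] >= count for key, count in part.items())
```

Under this reading, two state terms are equal exactly when they are built from the same collection of fluents, and a denial such as the second conjunct of (12) becomes a failed `is_part_of` test. Since multiset union counts occurrences, `compose(a, a)` differs from `a`, so the operation is not idempotent.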

The assertion that some fluent f holds (resp. does not hold) in some situation s
can now be formalized by $\exists z.\ State(s) = f \circ z$ (resp. $\forall z.\ State(s) \neq f \circ z$).
This allows us to reintroduce the Holds predicate, now, however, not as a primitive
notion but as a derived concept:

    $Holds(f, s) \equiv \exists z.\ State(s) = f \circ z$   (13)

In this way, any Situation Calculus assertion about situations can be directly
transferred to a formula of the Fluent Calculus. For instance, the (quite arbitrary)
Situation Calculus formula $\exists x.\ Poisoned(x, S_0) \vee \neg Alive(Aunt, S_0)$ reads
$\exists x.\ Holds(Poisoned(x), S_0) \vee \neg Holds(Alive(Aunt), S_0)$ in the Fluent Calculus.
We will use the notation HOLDS($\Phi$) to denote the formula that results from
transforming a Situation Calculus formula $\Phi$ into the reified version using the
Holds predicate.

^5 If, for example, the tea was already poisoned initially, then the state of the world
prior to and after Mix(Nephew, Milk, Tea) would have been the same, in terms of
which of the two liquids are poisoned and which of our protagonists is alive.

^6 The reader may wonder why function “$\circ$” is not expected to be idempotent, i.e.,
$\forall x.\ x \circ x = x$, which is yet another property of logical conjunction. The (subtle)
reason for this is given below.
Knowledge of effects of actions is formalized in terms of specifying how a
current state is modified when moving on to a next situation. The universal form
of what we call state update axiom is

    $\Delta(s) \supset \Gamma[\, State(Do(A, s)),\ State(s) \,]$   (14)

where $\Delta(s)$ states conditions on s, or rather on the corresponding state, under
which the successor state is obtained by modifying the current state according
to $\Gamma$. Typically, condition $\Delta(s)$ is a compound formula consisting of Holds(f, s)
atoms, as defined with the foundational axiom (13). The component $\Gamma$ defines
the way the state in situation s is modified according to the effects of the action
under consideration. Actions may initiate and terminate properties. We will
discuss the design of $\Gamma$ for these two cases in turn.

If an action has a positive effect, then the fluent which becomes true simply
needs to be coupled onto the state term. An example is the following
axiomatization of the (conditional) effect of mixing a liquid into a second one:

    $Holds(Poisoned(x), s) \wedge \neg Holds(Poisoned(y), s) \supset State(Do(Mix(p, x, y), s)) = State(s) \circ Poisoned(y)$
    $\neg Holds(Poisoned(x), s) \vee Holds(Poisoned(y), s) \supset State(Do(Mix(p, x, y), s)) = State(s)$   (15)



That is to say, if x is poisoned and y is not, then the new state is obtained
from the predecessor just by adding the fluent Poisoned(y), else nothing changes
at all and so the two states are identical. Notice that neither of the two state
update axioms mentions any non-effects.

If we substitute, in the two axioms (15), p, x, and y by Nephew, Milk,
and Tea, respectively, and s by $S_0$, then we can replace the term $State(S_0)$
in both resulting instances by the equal term as given in axiom (12). So doing
yields

$$\exists z \left[ \begin{aligned}
& Holds(Poisoned(Milk), S_0) \wedge \neg Holds(Poisoned(Tea), S_0) \supset {} \\
& \qquad State(Do(Mix(Nephew, Milk, Tea), S_0)) = Alive(Nephew) \circ Alive(Aunt) \circ z \circ Poisoned(Tea) \\
& \wedge\ \neg Holds(Poisoned(Milk), S_0) \vee Holds(Poisoned(Tea), S_0) \supset {} \\
& \qquad State(Do(Mix(Nephew, Milk, Tea), S_0)) = Alive(Nephew) \circ Alive(Aunt) \circ z \\
& \wedge\ \forall z'.\ z \neq Poisoned(Tea) \circ z'
\end{aligned} \right]$$

which implies, using the abbreviation $S_1 = Do(Mix(Nephew, Milk, Tea), S_0)$
and the correspondence (13) along with axioms for equality and assertion (12),

$$\exists z \left[ \begin{aligned}
& Holds(Poisoned(Milk), S_0) \supset {} \\
& \qquad State(S_1) = Alive(Nephew) \circ Alive(Aunt) \circ Poisoned(Milk) \circ Poisoned(Tea) \circ z \\
& \wedge\ \neg Holds(Poisoned(Milk), S_0) \supset {} \\
& \qquad State(S_1) = Alive(Nephew) \circ Alive(Aunt) \circ z \ \wedge\ \neg Holds(Poisoned(Tea), S_1)
\end{aligned} \right]$$

In this way we have obtained from an incomplete initial specification a still 
partial description of the successor state, which includes the unaffected fluents 
Alive(Nephew) and Alive(Aunt). These properties thus survived the application
of the effects axioms without the need to be carried over, one-by-one, by separate 
application of axioms. 

If an action has a negative effect, then the fluent f which becomes false
needs to be withdrawn from the current state State(s). The schematic equation
$State(Do(A, s)) \circ f = State(s)$ serves this purpose. Incidentally, this scheme is
the sole reason for not stipulating that “$\circ$” be idempotent. For otherwise the
equation $State(Do(A, s)) \circ f = State(s)$ would be satisfied if State(Do(A, s))
contained f. Hence this equation would not guarantee that f becomes false.
Vital for our scheme is also to ensure that state terms do not contain any fluent
twice or more, i.e.,

    $\forall s, x, z.\ State(s) = x \circ x \circ z \supset x = \emptyset$   (16)
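
The role of non-idempotence can be made concrete with a small Python sketch that again reads state terms as multisets. The function names are hypothetical and fluents are written as plain strings for brevity; the last function shows, under an assumed set-based (idempotent) reading, why the termination equation would then admit an unwanted solution.

```python
from collections import Counter


def terminate(state, f):
    """Solve new_state o f = state for new_state under the multiset reading."""
    if state[f] < 1:
        raise ValueError("no solution: the fluent does not hold in the state")
    remaining = Counter(state)
    remaining[f] -= 1
    return +remaining  # unary plus drops entries whose count reached zero


def respects_axiom_16(state):
    """Axiom (16): no fluent occurs twice or more in a state."""
    return all(count <= 1 for count in state.values())


def idempotent_solutions(state, f):
    """All sets X with X | {f} == state, i.e. the solutions if "o" were idempotent."""
    if f not in state:
        return []
    return [state - {f}, state]
```

With multisets the equation has exactly one solution, and by axiom (16) that solution no longer contains the fluent. With sets, the second solution returned by `idempotent_solutions` still contains it.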

These preparatory remarks lead us to the following axiomatization of the
(conditional) effect of drinking:

    $Holds(Alive(p) \circ Poisoned(x), s) \supset State(Do(Drink(p, x), s)) \circ Alive(p) = State(s)$
    $\neg Holds(Alive(p), s) \vee \neg Holds(Poisoned(x), s) \supset State(Do(Drink(p, x), s)) = State(s)$   (17)

That is to say, if p is alive and x is poisoned, then the new state is obtained
from the predecessor just by terminating Alive(p), else nothing changes at all.^7
Applying the two axioms (17) to what we have derived about the state in
situation $S_1$ yields, setting $S_2 = Do(Drink(Aunt, Tea), S_1)$ and performing
straightforward simplifications,

$$\exists z \left[ \begin{aligned}
& Holds(Poisoned(Milk), S_0) \supset {} \\
& \qquad State(S_2) \circ Alive(Aunt) = Alive(Nephew) \circ Alive(Aunt) \circ Poisoned(Milk) \circ Poisoned(Tea) \circ z \\
& \wedge\ \neg Holds(Poisoned(Milk), S_0) \supset {} \\
& \qquad State(S_2) = Alive(Nephew) \circ Alive(Aunt) \circ z
\end{aligned} \right]$$

^7 Actions may of course have both positive and negative effects at the same time, in
which case the component $\Gamma$ of a state update axiom combines the schemes for
initiating and terminating fluents. This general case is dealt with in Section 3.

This partial description^8 of the successor state again includes every persistent
fluent without having applied separate deduction steps for each. The Fluent
Calculus thus provides a solution to both the representational and the
inferential aspect of the Frame Problem which is capable of dealing with incomplete
knowledge about states.
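
The reasoning with incomplete knowledge can be approximated by a small Python sketch. Note the swapped-in technique: instead of deducing with the unknown remainder z as a variable, the sketch enumerates the candidate initial states that are compatible with specification (12) and progresses each one with axioms (15) and (17). The names `KNOWN`, `UNKNOWN`, and `final_state` are hypothetical, and states are plain sets because axiom (16) rules out duplicates.

```python
from itertools import combinations


def mix(p, x, y, state):
    """State update axioms (15)."""
    if ("Poisoned", x) in state and ("Poisoned", y) not in state:
        return state | {("Poisoned", y)}
    return state


def drink(p, x, state):
    """State update axioms (17)."""
    if ("Alive", p) in state and ("Poisoned", x) in state:
        return state - {("Alive", p)}
    return state


KNOWN = frozenset({("Alive", "Nephew"), ("Alive", "Aunt")})
# Assumed candidates for the unknown part z; Poisoned(Tea) is excluded by (12).
UNKNOWN = [("Poisoned", "Milk")]


def initial_states():
    """Every completion of the partial initial specification."""
    for size in range(len(UNKNOWN) + 1):
        for extra in combinations(UNKNOWN, size):
            yield KNOWN | frozenset(extra)


def final_state(s0):
    s1 = mix("Nephew", "Milk", "Tea", s0)
    return drink("Aunt", "Tea", s1)


# Initial states consistent with the witness statement that the aunt is not alive.
explanations = [s0 for s0 in initial_states() if ("Alive", "Aunt") not in final_state(s0)]
```

In this toy domain the only remaining candidate is the one in which the milk was poisoned initially, matching the explanation derived in the text. Unaffected fluents such as Alive(Nephew) persist in every candidate without any per-fluent step.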

3 The General Method 

Having illustrated the design and use of state update axioms by example, in this
section we will present a general, fully mechanical procedure which generates
a suitable set of state update axioms from a given collection of Situation Calculus
effect axioms, like (1) and (2). As indicated in the introduction, we will only
consider actions without open effects (cf. Footnote 1). This is reflected in the
assumption that each positive effect specification be of the following form, where
A denotes an action and F a fluent:

    $\varepsilon^+_{A,F}(\vec{x}, s) \supset F(\vec{y}, Do(A(\vec{x}), s))$   (18)

Here, $\varepsilon^+_{A,F}$ is a first-order formula whose free variables are among $\vec{x}$, s; and $\vec{y}$
contains only variables from $\vec{x}$. Notice that it is the very last restriction which
ensures that the effect specification does not describe what is called an open
effect: Except for the situation term, all arguments of the effect F are bound by
the action term $A(\vec{x})$. Likewise, negative effect specifications are of the form

    $\varepsilon^-_{A,F}(\vec{x}, s) \supset \neg F(\vec{y}, Do(A(\vec{x}), s))$   (19)

where again $\varepsilon^-_{A,F}$ is a first-order formula whose free variables are among $\vec{x}$, s and
where $\vec{y}$ contains only variables from $\vec{x}$.^9 We assume that a given set $\mathcal{E}$ of
effect axioms is consistent in that for all A and F the unique names assumption
entails $\neg\exists \vec{x}, s\, [\, \varepsilon^+_{A,F}(\vec{x}, s) \wedge \varepsilon^-_{A,F}(\vec{x}, s) \,]$.

^8 which by the way, since $State(S_2) = Alive(Nephew) \circ Alive(Aunt) \circ z$ implies that
$Holds(Alive(Aunt), S_2)$, leads directly to the resolution of the murder mystery: Along
with the statement of the witness, $\neg Holds(Alive(Aunt), S_2)$, the formula above
logically entails the explanation that $Holds(Poisoned(Milk), S_0)$.

^9 Our two effect axioms at the beginning of Section 2.1 fit this scheme, namely,
by equating $\varepsilon^+_{Mix,Poisoned}(p, x, y, s)$ with Poisoned(x, s) and $\varepsilon^-_{Drink,Alive}(p, x, s)$ with
$Alive(p, s) \wedge Poisoned(x, s)$.
Fundamental for any attempt to solve the Frame Problem is the assumption
that a given set of effect axioms is complete in the sense that it specifies all
relevant effects of all involved actions.^10 Our concern, therefore, is to design state
update axioms for a given set of effect specifications which suitably reflect the
completeness assumption. The following instance of scheme (14) is the general
form of state update axioms for deterministic actions with only direct effects:

    $\Delta(s) \supset State(Do(A, s)) \circ \vartheta^- = State(s) \circ \vartheta^+$

where $\vartheta^-$ are the negative effects and $\vartheta^+$ the positive effects, respectively, of
action A under condition $\Delta(s)$. The main challenge for the design of these
state update axioms is to make sure that condition $\Delta$ is strong enough for the
equation in the consequent to be sound. Neither must $\vartheta^+$ include a fluent that
already holds in situation s (for this would contradict the foundational axiom
about multiple occurrences, (16)), nor should $\vartheta^-$ specify a negative effect that
is already false in s (for then EUNA implies that the equation be false). This is
the motivation behind steps 1 and 2 of the procedure below. The final and main
step 3 reflects the fact that actions with conditional effects require more than
one state update axiom, each applying in different contexts:

1. Rewrite to $\varepsilon^+_{A,F}(\vec{x}, s) \wedge \neg F(\vec{y}, s) \supset F(\vec{y}, Do(A(\vec{x}), s))$ each positive
effect axiom of the form (18).

2. Similarly, rewrite to $\varepsilon^-_{A,F}(\vec{x}, s) \wedge F(\vec{y}, s) \supset \neg F(\vec{y}, Do(A(\vec{x}), s))$ each
negative effect axiom of the form (19).

3. For each action A, let the following $n > 0$ axioms be all effect axioms thus
rewritten (positive and negative) concerning A:

    $\varepsilon_1(\vec{x}, s) \supset F_1(\vec{y}_1, Do(A(\vec{x}), s)),\ \ldots,\ \varepsilon_m(\vec{x}, s) \supset F_m(\vec{y}_m, Do(A(\vec{x}), s))$
    $\varepsilon_{m+1}(\vec{x}, s) \supset \neg F_{m+1}(\vec{y}_{m+1}, Do(A(\vec{x}), s)),\ \ldots,\ \varepsilon_n(\vec{x}, s) \supset \neg F_n(\vec{y}_n, Do(A(\vec{x}), s))$

Then, for any pair of subsets $I^+ \subseteq \{1, \ldots, m\}$, $I^- \subseteq \{m+1, \ldots, n\}$
(including the empty ones) introduce the following state update axiom:

    $\bigwedge_{i \in I^+ \cup I^-} \mathrm{HOLDS}(\varepsilon_i(\vec{x}, s)) \ \wedge\ \bigwedge_{j \notin I^+ \cup I^-} \mathrm{HOLDS}(\neg\varepsilon_j(\vec{x}, s)) \ \supset\ State(Do(A(\vec{x}), s)) \circ \vartheta^- = State(s) \circ \vartheta^+$

where $\vartheta^-$ is the term $F_1 \circ \ldots \circ F_k$ if $\{F_1, \ldots, F_k\} = \{F_i(\vec{y}_i) : i \in I^-\}$
and, similarly, $\vartheta^+$ is the term $F_1 \circ \ldots \circ F_k$ if $\{F_1, \ldots, F_k\} = \{F_i(\vec{y}_i) : i \in I^+\}$.^11

Step 3 blindly considers all combinations of positive and negative effects. Some of
the state update axioms thus obtained may have an inconsistent antecedent, in which
case they can be removed. To illustrate the interaction of context-dependent
positive and negative effects, let us apply our procedure to these two effect
axioms:

^10 If actions have additional, indirect effects, then this gives rise to the so-called
Ramification Problem; see Section 4.

^11 Thus $\vartheta^-$ contains the negative effects and $\vartheta^+$ the positive effects specified in the
update axiom. If either set is empty then the respective term is the unit element, $\emptyset$.

    $Loaded(s) \supset Dead(Do(Shoot, s))$
    $true \supset \neg Loaded(Do(Shoot, s))$

After rewriting according to steps 1 and 2, step 3 produces four state update
axioms, viz.

    $\neg [\, Holds(Loaded, s) \wedge \neg Holds(Dead, s) \,] \wedge \neg [\, true \wedge Holds(Loaded, s) \,] \supset State(Do(Shoot, s)) \circ \emptyset = State(s) \circ \emptyset$
    $\neg [\, Holds(Loaded, s) \wedge \neg Holds(Dead, s) \,] \wedge true \wedge Holds(Loaded, s) \supset State(Do(Shoot, s)) \circ Loaded = State(s) \circ \emptyset$
    $Holds(Loaded, s) \wedge \neg Holds(Dead, s) \wedge \neg [\, true \wedge Holds(Loaded, s) \,] \supset State(Do(Shoot, s)) \circ \emptyset = State(s) \circ Dead$
    $Holds(Loaded, s) \wedge \neg Holds(Dead, s) \wedge true \wedge Holds(Loaded, s) \supset State(Do(Shoot, s)) \circ Loaded = State(s) \circ Dead$

Logical simplification of the premises of the topmost two axioms yields

    $\neg Holds(Loaded, s) \supset State(Do(Shoot, s)) = State(s)$
    $Holds(Dead, s) \wedge Holds(Loaded, s) \supset State(Do(Shoot, s)) \circ Loaded = State(s)$

The third axiom can be abandoned because of an inconsistent antecedent, while
the fourth axiom simplifies to

    $Holds(Loaded, s) \wedge \neg Holds(Dead, s) \supset State(Do(Shoot, s)) \circ Loaded = State(s) \circ Dead$

(The interested reader may verify that applying the general procedure to our 
effect axioms (1) and (2) yields four axioms which, after straightforward simpli- 
fication, turn out to be (15) and (17), respectively.) 
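
The three steps of the procedure can be sketched in Python for the variable-free Shoot example. This is an illustration under simplifying assumptions rather than the general first-order construction: effect conditions are modelled as Python predicates over a concrete state (a set of fluent names), and all function names are hypothetical.

```python
from itertools import chain, combinations

# Effect axioms for Shoot as triples (condition, fluent, is_positive).
EFFECTS = [
    (lambda st: "Loaded" in st, "Dead", True),   # Loaded(s) implies Dead(Do(Shoot, s))
    (lambda st: True, "Loaded", False),          # true implies not Loaded(Do(Shoot, s))
]


def rewritten(effects):
    """Steps 1 and 2: strengthen each condition so that the effect is a real change."""
    out = []
    for cond, f, positive in effects:
        if positive:
            out.append((lambda st, c=cond, f=f: c(st) and f not in st, f, True))
        else:
            out.append((lambda st, c=cond, f=f: c(st) and f in st, f, False))
    return out


def powerset(items):
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


def update_axioms(effects):
    """Step 3: one axiom (antecedent, negative effects, positive effects) per index subset."""
    rw = rewritten(effects)
    axioms = []
    for fired in powerset(range(len(rw))):
        fired = set(fired)

        def antecedent(st, fired=fired):
            # Conditions of selected axioms hold; those of all others do not.
            return all(rw[i][0](st) == (i in fired) for i in range(len(rw)))

        neg = frozenset(rw[i][1] for i in fired if not rw[i][2])
        pos = frozenset(rw[i][1] for i in fired if rw[i][2])
        axioms.append((antecedent, neg, pos))
    return axioms


def progress(state, axioms):
    """Solve new o neg = state o pos using the single axiom whose antecedent holds."""
    applicable = [(neg, pos) for ant, neg, pos in axioms if ant(state)]
    assert len(applicable) == 1
    neg, pos = applicable[0]
    return (state | pos) - neg
```

Choosing a subset of all rewritten axioms is the same as choosing a pair of subsets of the positive and the negative ones, so the example yields four axioms. Checking the antecedents against every state over the two fluents shows that one of them is unsatisfiable, which corresponds to the abandoned third axiom above.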

The following primary theorem for the Fluent Calculus shows that the
resulting set of state update axioms correctly reflects the effect axioms if the
fundamental completeness assumption is made.

Theorem 1. Consider a finite set $\mathcal{E}$ of effect axioms which complies with the
assumption of consistency, and let SUA be the set of state update axioms
generated from $\mathcal{E}$. Suppose $\mathcal{M}$ is a model of $SUA \cup \{(13), (16)\} \cup EUNA$,^12 and
consider a fluent term $F(\vec{\tau})$, an action term $A(\vec{\rho})$, and a situation term $\sigma$.
Then $\mathcal{M} \models Holds(F(\vec{\tau}), Do(A(\vec{\rho}), \sigma))$ iff

1. $\mathcal{M} \models \varepsilon^+_{A,F}(\vec{\rho}, \sigma)$, for the instance $\varepsilon^+_{A,F}(\vec{\rho}, \sigma) \supset F(\vec{\tau}, Do(A(\vec{\rho}), \sigma))$ of
some axiom in $\mathcal{E}$;

2. or $\mathcal{M} \models Holds(F(\vec{\tau}), \sigma)$ and no instance $\varepsilon^-_{A,F}(\vec{\rho}, \sigma) \supset
\neg F(\vec{\tau}, Do(A(\vec{\rho}), \sigma))$ of an axiom in $\mathcal{E}$ exists such that $\mathcal{M} \models \varepsilon^-_{A,F}(\vec{\rho}, \sigma)$.

^12 Recall that EUNA, the extended unique names assumption, axiomatizes equality
and inequality of terms with the function “$\circ$”.
4 Conclusion

We have pursued a new motivation for the Fluent Calculus, namely, as the 
outcome of applying the principle of reification to successor state axioms. The 
resulting concept of state update axioms copes with the inferential Frame Prob- 
lem without losing the solution to the representational aspect. We have shown 
how, much like in [13], a suitable collection of these axioms can be automatically 
derived from a complete (wrt. the relevant fluents and actions) set of single effect 
axioms, provided actions have no open effects. Since state update axioms cover 
the entire change an action causes in order to solve the inferential aspect of the 
Frame Problem, their number is, in the worst case, exponentially larger than 
the number of single effect axioms. This is perfectly acceptable since actions are 
viewed as having very few effects compared to the overall number of fluents. 

Open effects can only be implicitly described in state update axioms. An
example is the following axiom (cf. Footnote 1):

    $Bomb(x) \supset \big[\, \forall f, y\, [\, f = Destroyed(y) \wedge Holds(Nearby(x, y), s) \wedge \neg Holds(f, s) \equiv \exists z.\ w = f \circ z \,] \supset State(Do(Explodes(x), s)) = z \circ w \,\big]$

in which w, the positive effects of the action, is defined rather than explicitly
given. It lies in the nature of open effects that a suitable state update axiom
can only implicitly describe the required update and so no longer solves the
inferential Frame Problem (though it still covers the representational aspect).

The problem of action preconditions has been ignored for the sake of clarity.
Dealing with them requires no special treatment in the Fluent Calculus since
each Situation Calculus assertion about what holds in a situation corresponds
directly to a Fluent Calculus assertion via the fundamental relation (13).

The basic Fluent Calculus as investigated in this paper assumes state update 
axioms to describe all effects of an action. The solution to the Ramification 
Problem of [14], and in particular its axiomatization in the Fluent Calculus, 
furnishes a ready approach for elaborating the ideas developed in the present 
paper so as to deal with additional, indirect effects of actions. 

The version of the Fluent Calculus we arrived at in this paper differs con- 
siderably from its roots [7], e.g. in that it exploits the full expressive power of 
first-order logic. In so doing it is much closer to the variant introduced in [14], 
but still novel is the notion of state update axioms. In particular the new
function State(s) seems to lend more elegance to effect specifications and at the
same time emphasizes the relation to the Situation Calculus.
Abstract. We describe an implementation of the Display Logic calculus
for relation algebra as an Isabelle theory. Our implementation is the first 
mechanisation of any display calculus. The inference rules of Display 
Logic are coded directly as Isabelle theorems, thereby guaranteeing the 
correctness of all derivations. Our implementation generalises easily to 
handle other display calculi. It also provides a useful interactive proof 
assistant for relation algebras. 

We describe various tactics and derived rules developed for simplify- 
ing proof search, including an automatic cut-elimination procedure, and 
example theorems proved using Isabelle. We show how some relation 
algebraic theorems proved using our system can be put in the form of 
structural rules of Display Logic, facilitating later re-use. We then show 
how the implementation can be used to prove results comparing alterna- 
tive formalizations of relation algebra from a proof-theoretic perspective. 

Keywords: proof systems for relation algebra, non-classical logics, auto- 
mated deduction, display logic, description logics 



1 Introduction 

Relation algebras are extensions of Boolean algebras; whereas Boolean alge- 
bras model subsets of a given set, relation algebras model binary relations on a 
given set. Thus relation algebras have relational operations such as composition 
and converse. As each relation is itself a set (of pairs), relation algebras also 
have the Boolean operations such as intersection (conjunction) and complement 
(negation). Relation algebras form the basis of relational databases and of the 
specification and proof of correctness of programs. Recently, relation algebras 
and their extensions, Peirce algebras, have also been shown to form the basis
of description logics [5]. Just as Boolean algebras can be studied as a logical 
system (classical propositional logic), relation algebras can also be studied in a 
logical, rather than an algebraic, fashion. In particular, relation algebras can be 
formulated using Display Logic [12], and in several other ways [16]. 

Display Logic [1] is a syntactic proof system for non-classical logic, based on
the Gentzen sequent calculus [9]. Its advantages include a generic cut-elimination
theorem, which applies whenever the rules for the display calculus satisfy certain
conditions. It is a significant general logical formalism, applicable to many logics;
its mechanisation is therefore an important challenge for automated reasoning.

In this paper we describe an implementation of $\delta$RA [12], a display logic
formulation of relation algebras, using the Isabelle theorem prover. In this
implementation the rules of $\delta$RA form the axioms of an Isabelle object logic;
$\delta$RA is not built upon one of the standard Isabelle object logics (as is the case,
for example, with RALL [17]). The ease with which this display calculus can
be implemented in Isabelle highlights the generality of both Isabelle and Display
Logic. We demonstrate how Isabelle can be used to show the relationship
between $\delta$RA and two other formalizations of relation algebras.

1.1 Other mechanised proof systems for relation algebras 

RALL [17] is a theorem proving system for relation algebra, based on Isabelle. It 
uses the atomicity of relation algebras; every relation algebra can be embedded 
in an atomic relation algebra. Although RALL provides automated proof search, 
this feature is heavily dependent on the atomization of the relation algebra. It 
is still not clear to us to what extent RALL is applicable to relation algebras 
which are not themselves atomic. RALL is built upon the HOL Isabelle theory, 
whereas our implementation of $\delta$RA is built directly upon Isabelle’s metalogic.

RALF [2] is a graphically oriented relation algebra formula manipulation sys- 
tem and proof assistant. It contains a large number of hard-coded transformation 
rules, and the super-user can add others. However the rules do not form a formal 
calculus. Thus it does not demonstrate that the results it derives follow from any 
formalization of relation algebra. In fact it may be seen as complementing a
system such as that described in this paper since an interesting avenue for further
work would be to obtain the transformation rules of RALF as $\delta$RA derived
rules, giving a rigorous basis to RALF and a graphical front-end to $\delta$RA.

2 Relation Algebras 

A relation on a set U is a set of ordered pairs (a, b) of elements of U, i.e., a subset of
$U \times U$. Since a relation is itself a set (of ordered pairs), the operations on relations
include the operations of Boolean algebras. Other relational operations are (as
defined in [12]) composition $R \circ S = \{(a, b) \mid \exists c.\ (a, c) \in R \text{ and } (c, b) \in S\}$, its
identity $\{(a, a) \mid a \in U\}$, and converse $\smile R = \{(a, b) \mid (b, a) \in R\}$.

Relation algebras are an abstraction of the notion of binary relations on a 
set U ; reference to the actual elements of the set is abstracted away, leaving the 
relations themselves as the objects under consideration. Chin & Tarski [6] give 
a finite equational axiomatization of relation algebras. The following definition, 
equivalent to that in [6], is taken from [5] and [12]. 

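A relation-type algebra ℛ = ⟨R, ∨, ∧, ¬, ⊥, ⊤, ∘, ⌣, 1⟩ (ie, a free algebra on 
the set of variables R with binary operators ∨, ∧ and ∘, unary operators ¬ and 
⌣, and constants ⊥, ⊤ and 1) is a relation algebra if it satisfies the following 
axioms for each r, s, t ∈ R 

$$
\begin{array}{ll}
(\mathrm{RA1}) & \langle R, \lor, \land, \neg, \bot, \top \rangle \text{ is a Boolean algebra} \\
(\mathrm{RA2}) & r \circ (s \circ t) = (r \circ s) \circ t \\
(\mathrm{RA3}) & r \circ 1 = r = 1 \circ r \\
(\mathrm{RA4}) & {\smile}{\smile} r = r \\
(\mathrm{RA5}) & (r \lor s) \circ t = (r \circ t) \lor (s \circ t) \\
(\mathrm{RA6}) & {\smile}(r \lor s) = ({\smile} r) \lor ({\smile} s) \\
(\mathrm{RA7}) & {\smile}(r \circ s) = ({\smile} s) \circ ({\smile} r) \\
(\mathrm{RA8}) & {\smile} r \circ \neg(r \circ s) \le \neg s
\end{array}
$$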


Further operators can be defined: A ≤ B, as used in (RA8), means 
(A ∧ B) = A. Duals of 1 and ∘ are defined: 0 is ¬1, and A + B is ¬(¬A ∘ ¬B). 
We call this theory of relation algebras RA. A proper relation algebra consists 
of a set of relations on a set; note that not all relation algebras are isomorphic 
to a proper relation algebra ([16], p. 444-5). 
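
To make the axioms concrete, one can check them by brute force on a small proper relation algebra. The sketch below is hypothetical illustration code, not part of the implementation described in this paper: it enumerates all 16 relations on a two-element set and tests (RA2) to (RA8), reading ≤ as set inclusion and taking (RA8) in the form with the converse on the first factor. The helper `rsum` implements the derived operator + from its definition.

```python
from itertools import combinations, product

U = (0, 1)
TOP = frozenset(product(U, U))        # the largest relation, U x U
ONE = frozenset((a, a) for a in U)    # the identity relation 1
ZERO = TOP - ONE                      # 0, defined as the complement of 1


def comp(r, s):
    """Relational composition r o s."""
    return frozenset((a, b) for (a, c) in r for (d, b) in s if c == d)


def conv(r):
    """Converse of r."""
    return frozenset((b, a) for (a, b) in r)


def neg(r):
    """Boolean complement of r relative to U x U."""
    return TOP - r


def rsum(r, s):
    """The dual of composition: r + s = not(not r o not s)."""
    return neg(comp(neg(r), neg(s)))


def all_relations():
    """Yield every relation on U, ie every subset of U x U."""
    pairs = sorted(TOP)
    for size in range(len(pairs) + 1):
        for subset in combinations(pairs, size):
            yield frozenset(subset)


RELS = list(all_relations())


def axioms_hold():
    """Return True if (RA2)-(RA8) hold for all relations on U."""
    for r in RELS:
        if not (comp(r, ONE) == r == comp(ONE, r)):                    # (RA3)
            return False
        if conv(conv(r)) != r:                                         # (RA4)
            return False
    for r, s in product(RELS, repeat=2):
        if conv(r | s) != conv(r) | conv(s):                           # (RA6)
            return False
        if conv(comp(r, s)) != comp(conv(s), conv(r)):                 # (RA7)
            return False
        if not comp(conv(r), neg(comp(r, s))) <= neg(s):               # (RA8)
            return False
    for r, s, t in product(RELS, repeat=3):
        if comp(r, comp(s, t)) != comp(comp(r, s), t):                 # (RA2)
            return False
        if comp(r | s, t) != comp(r, t) | comp(s, t):                  # (RA5)
            return False
    return True


print(axioms_hold())
```

Such a check says nothing about relation algebras that are not proper, which is exactly the case where a formal calculus is needed.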



3 Relation Algebras in Display Logic 



A number of different logical systems can be formulated using the method, or 
style, of Display Logic [1]. These include several normal modal logics [21], and 
intuitionistic logic [11]. Display Logic resembles the Gentzen sequent calculus 
LK, but with significant differences. For example, the rules for introducing the 
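connective ‘∨’ (on the right) in LK and Display Logic are 

$$\frac{\Gamma \vdash \Delta, P, Q}{\Gamma \vdash \Delta, P \lor Q}\ (\text{LK-}{\vdash}\lor)$$

and 

$$\frac{X \vdash P, Q}{X \vdash P \lor Q}\ (\text{DL-}{\vdash}\lor)$$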



Whereas, in LK, Γ and Δ denote comma-separated lists of formulae, in Display 
Logic, X denotes a Display Logic structure, which can involve several structural 
operators (one of which is ‘,’). (See [12] or [7] for a full explanation of structures 
and formulae.) In Display Logic, unlike in LK, the introduced formula (here, 
P ∨ Q) stands by itself on one side of the turnstile. However there are also rules 
(the “display postulates”) which effectively allow moving formulae from one side 
to the other. In [12] there is a Display Logic formulation of relation algebras, 
called δRA, whose structural operators are ‘,’, ‘;’, ‘*’, ‘•’, ‘I’ and ‘E’. As in LK, the use 
of ‘,’ to stand for either ‘∧’ or ‘∨’ (depending on the position) reflects the duality 
between them. Likewise in δRA ‘;’ stands for either ‘∘’ or ‘+’, whose duality 
was noted above. Each structural operator stands for one, or two (depending 
on the position), formula operators. Thus I stands for truth or falsity, E for 
the identity relation or its complement, * for Boolean negation and • for the 
relational converse. The rules of δRA are sound and complete with respect to 
the equational axiomatization of Chin & Tarski [12]. 

Like LK, Display Logic is suited to backwards search for proof of a sequent, 
provided that a sequent is provable without using the cut rule. In terms of 
backwards proof, the logical introduction rules eliminate the logical operators 
and replace them with corresponding structural operators. The cut-elimination 
theorem of LK applies in all Display Logics, provided that the rules satisfy 
certain conditions, which are relatively easily checked ([1]). Indeed the procedure 
for eliminating cuts from a proof has been automated, see §5.4. 

Unlike LK, Display Logic has bi-directional rules, the display postulates. 
Therefore any proof search technique must avoid a naive application of these ad 
infinitum. Although this is a difficulty in developing a purely automatic proof 
search technique, it is not insurmountable, but it requires further work. For 
example, the techniques described below are sufficient to obtain a decision pro- 
cedure for the classical propositional calculus (see [7], §2.6.1 for details). In any 
case it is known that the theory of relation algebras is undecidable (see [16], 
p.432). 

Note that although Display Logic is suited to backwards proof, we will always 
write a proof tree, or a rule, with the premises at the top and the conclusion at 
the bottom (except that a double line in a rule means that it is bi-directional), 
and we will refer to the order of steps in a proof as though it is done in a forward 
direction, from premises to conclusion. As an alternative to writing a rule with 
one premise P and a conclusion C separated by a horizontal line, we often write 
P ⟹ C, and for a bi-directional rule we often write P ⟺ C. For a full 
explanation of Display Logic and δRA see [12]. 

4 Isabelle 

Isabelle is an interactive computer-based proof system, described in [18]. Its 
capabilities include higher-order unification and term rewriting. It is written 
in Standard ML; when it is running, the user can interact with it by entering 
further ML commands, and can program complex proof sequences in ML. Isabelle 
provides a number of basic proof steps for backwards proof (tactics), as well as 
tacticals for combining these. Isabelle also supports forward proof. 

It has a simple meta-logic, an intuitionistic higher-order logic. The user then 
has to augment this by defining an object-level logic, though normally one would 
use one of several which are packaged with the Isabelle distribution [19]. 

Thus a logical system such as δRA can be implemented in Isabelle so that 
the only proof rules available are those of δRA. This contrasts with an implementation 
in other proof systems, where the δRA rules would be added to the 
rules of the logic on which that proof system was based. (Of course, systems can 
be implemented in this way in Isabelle too; thus RALL [17] is implemented on 
top of the HOL Isabelle theory). 

Isabelle theorems (including the axioms of the theory) may, but need not, involve 
Isabelle’s metalogical operators. In the Isabelle implementation of Display 
Logic, an Isabelle theorem (of type thm) may be either a simple sequent of δRA, 
or a sequent rule. Whereas a sequent of δRA becomes an Isabelle theorem of 
the form X ⊢ Y, a sequent rule becomes an Isabelle theorem containing also the 
operators ==> (implication) or == (equality). For example, the (⊢∨) rule (shown 
in §3) appears as 

"$?X |- ?A, ?B ==> $?X |- ?A V ?B" 

in Isabelle. (The ‘?’ denotes a variable which can be instantiated, and the ‘$’ 
denotes a structural variable or constant.) The variant A ⊢ A of the (id) rule 
(see §5.4) appears as "?A |- ?A". The bi-directional dcp rule, shown in Fig. 1, 
appears as "$?X |- $?Y == * $?Y |- * $?X". 

Isabelle can be used to show the equivalence of δRA to the various other 
systems; in most cases this serves only to confirm proofs appearing in the literature, 
and in many cases it can only provide proofs of the various inductive 
steps of a proof that requires structural induction. However it shows that δRA 
is effective for proving the axioms or rules of the other systems. 

5 Implementation in Isabelle 

The syntax of δRA has been implemented in Isabelle. The source files are 
presently available at http://arp.anu.edu.au:80/~jeremy/files, and further 
details may be found in [7]. The syntax distinguishes structure and formula variables; 
externally, at the user level, a structure variable is preceded by ‘$’. The 
method used for this was taken directly from the Isabelle theory LK ([19], Ch 6). 

The δRA rules have been implemented in Isabelle. Some particularly useful 
derived rules are shown in Fig. 1. It will be noticed that the two unary structural 
operators commute, and both distribute (in a sense) over both the binary structural 
operators. These distributive rules are used, for example, in §7.1. Further, 
there is a one-directional distributive rule for ‘;’ over ‘,’. The rules blcEa and 
blcEs, in particular, are elegant results which are much easier to prove for the 
case of a proper relation algebra. 



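$$\frac{X \vdash {\ast}{\ast}Y}{X \vdash Y}\ (\text{rssS})$$

$$X \vdash Y \Longleftrightarrow {\ast}Y \vdash {\ast}X\ \ (\text{dcp})$$

$$\frac{X \vdash {\bullet}{\bullet}Y}{X \vdash Y}\ (\text{rbbS})$$

$$\frac{X \vdash Y}{{\bullet}X \vdash {\bullet}Y}\ (\text{blbl})$$

$$\frac{{\ast}(X, Y) \vdash Z}{{\ast}Y, {\ast}X \vdash Z}\ (\text{stcdista})$$

$$\frac{{\ast}(X ; Y) \vdash Z}{{\ast}X ; {\ast}Y \vdash Z}\ (\text{stscdista})$$

$${\bullet}(X ; Y) \vdash Z \Longleftrightarrow {\bullet}Y ; {\bullet}X \vdash Z\ \ (\text{blscdista})$$

$$\frac{({\bullet}X, {\bullet}Y) \vdash Z}{{\bullet}(X, Y) \vdash Z}\ (\text{blcdista})$$

$$\frac{(X ; Y), (X ; Z) \vdash W}{X ; (Y, Z) \vdash W}\ (\text{sccldista})$$

$$\frac{X ; Y \vdash Z}{{\bullet}Y ; {\bullet}X \vdash {\bullet}Z}\ (\text{tagae})$$

$$\frac{{\bullet}{\ast}X \vdash Y}{{\ast}{\bullet}X \vdash Y}\ (\text{bsA})$$

$$\frac{(X ; Z), (Y ; Z) \vdash W}{(X, Y) ; Z \vdash W}\ (\text{sccrdista})$$

$$\frac{E \vdash X}{{\ast}E \vdash X}\ (\text{sEa})$$

$$\frac{I \vdash X}{{\ast}I \vdash X}\ (\text{sIa})$$

$$\frac{{\bullet}X, E \vdash Y}{X, E \vdash Y}\ (\text{blcEa})$$

$$\frac{Y \vdash {\bullet}X, E}{Y \vdash X, E}\ (\text{blcEs})$$

$$\frac{A \vdash X \qquad B \vdash X}{A \lor B \vdash X}\ (\text{orA})$$

$$\frac{X \vdash A \qquad X \vdash B}{X \vdash A \land B}\ (\text{andS})$$

Fig. 1. Selected Derived Rules 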

Of the δRA logical introduction rules, some can be applied always (in a 
backwards proof) when the relevant formula is displayed. Others, namely (⊢∧), 
(∨⊢), (⊢∘) and (+⊢), are subject to the “constraint” that the structure on the 
other side of ⊢ be of a particular form. The rules andS and orA are equivalent to 
(⊢∧) and (∨⊢). andS and orA can be applied always when the relevant formula 
is displayed. Unfortunately the introduction rules for ‘∘’ and ‘+’ have no such 
equivalents. Thus the introduction of ‘∘’ on the right of ⊢ is always subject to the 
“constraint” that the structure on the left of ⊢ be of a particular form; similarly 
for introducing ‘+’ on the left. 

As in classical sequent calculus, the modus operandi for proving a sequent is 
to start the proof from the bottom, ie, from the goal, and work “backwards”. 
The first step is, where possible, to remove all the formula (logical) operators, 
using the logical introduction rules. In Display Logic (unlike classical sequent 
calculus) this requires that the formula which is to be “broken up” must first 
be displayed. As any formula can be displayed, this is not a difficulty, except in 
that it is tedious to select display postulates one-by-one. The tactics described 
in the next two subsections were first written to help in this style of proof. 



5.1 Display Tactics 

The functions described below were written to help in the process of displaying 
a chosen substructure. 

disp_tac : string -> int -> tactic replaces a subgoal by one in which a 
selected structure is displayed. The structure selected is indicated by a string 
argument, of which the first character is ‘|’ or ‘-’ to indicate the left or right 
side of the turnstile. Remaining characters are 

‘l’ or ‘r’, to indicate the left or right operand of the ‘,’ operator 
‘L’ or ‘R’, to indicate the left or right operand of the ‘;’ operator 
‘*’ or ‘O’, to indicate the operand of the ‘*’ or the ‘•’ operator 

For example, in the (sub)goal R ⊢ ((*T; •*S), •*Q); Y the string "-L" 
denotes the sub-structure ((*T; •*S), •*Q). Thus disp_tac "-L" performs 
the first (bottom) step of the proof example shown below. 

fdisp : string -> thm -> thm is the corresponding function for forward 
proof, displaying the selected part of the conclusion of the given theorem. 

d_b_tac : string -> (int -> tactic) -> int -> tactic 

d_b_tac str tacfn sg takes subgoal sg, displays the part selected by str (as for 
disp_tac) and applies tacfn to the transformed subgoal. It then applies the 
reverse of the display steps to each resulting subgoal. This requires that the 
subgoals produced by tacfn are in a suitable form to permit the display steps 
used to be reversed. This will be the case if tacfn only affects the displayed 
sub-structure, and leaves the rest of the sequent alone. This was the purpose 
of deriving the rules andS and orA (which satisfy this requirement) from 
ands and ora (which do not). 

For example, if we start with the subgoal R ⊢ ((*T; •*S), •*Q); Y then 
d_b_tac "-L" (rtac mrs) applies the monotonicity rule (⊢M) (mrs) to 
the sub-structure (*T; •*S), •*Q by the following steps (read upwards) 

R ⊢ (*T; •*S); Y 

(reverse of disp_tac "-L") 

R; *•Y ⊢ *T; •*S 

(mrs) 

R; *•Y ⊢ (*T; •*S), •*Q 

(disp_tac "-L") 

R ⊢ ((*T; •*S), •*Q); Y 

fdispfun : string -> (thm -> thm) -> thm -> thm 

is a corresponding operation for forward proof. fdispfun str thfn th displays 
the part of theorem th indicated by string str, applies transformation thfn 
to the result, then reverses the display steps used. 
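
The string argument of these functions is essentially a path into the sequent. The following Python sketch is hypothetical and is not the ML implementation: it only locates the sub-structure named by such a path, whereas disp_tac additionally applies display postulates until that sub-structure stands alone on one side. It assumes that ‘|’ and ‘-’ select the left and right side of the turnstile and that ‘O’ selects the operand of ‘•’; the nested-tuple encoding of structures and the name `select` are chosen here purely for illustration.

```python
# Structures are encoded as nested tuples (an illustrative encoding):
#   (",", X, Y)  (";", X, Y)  ("*", X)  ("•", X)  or a string for a variable.
# A sequent is a pair (antecedent, succedent).

STEP = {
    "l": (",", 1), "r": (",", 2),   # left / right operand of ','
    "L": (";", 1), "R": (";", 2),   # left / right operand of ';'
    "*": ("*", 1),                  # operand of '*'
    "O": ("•", 1),                  # operand of '•' (assumed meaning of 'O')
}


def select(sequent, path):
    """Return the sub-structure of `sequent` addressed by `path`."""
    antecedent, succedent = sequent
    if path[0] == "|":
        term = antecedent
    elif path[0] == "-":
        term = succedent
    else:
        raise ValueError("path must start with '|' or '-'")
    for ch in path[1:]:
        if ch not in STEP:
            raise ValueError("unknown path character: " + ch)
        op, index = STEP[ch]
        if not (isinstance(term, tuple) and term[0] == op):
            raise ValueError("path does not match the structure at " + ch)
        term = term[index]
    return term


# The example goal  R |- ((*T; •*S), •*Q); Y  from the text.
goal = ("R",
        (";",
         (",", (";", ("*", "T"), ("•", ("*", "S"))), ("•", ("*", "Q"))),
         "Y"))

print(select(goal, "-L"))
```

With this encoding, the path "-L" picks out the structure ((*T; •*S), •*Q), as in the example above.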



5.2 Search-and-Replace Tactics 

Tactics were written to display all structures in a sequent in turn, and apply any 
appropriate transformations (selected from a given set) to the structure. These 
can be used to rewrite all substructures of a particular form, for example, to 
rewrite all occurrences of “X, (Y, Z)” to “(X, Y), Z”. 

glob_tac : 
(term -> int -> (int -> tactic) list) -> int -> tactic 

The first argument of glob_tac is a function actfn, which is called as actfn 
tm asp, where tm is a subterm, and asp has the value 0 or 1 to indicate that 
tm is displayed as the antecedent or succedent part of a sequent. actfn tm 
asp should return either [] or [tacf], where, when sg is a subgoal number, 
tacf sg is a tactic. When tm is displayed as the antecedent (if asp = 0) or 
succedent (if asp = 1), tacf should be a tactic which changes only tm. 

Then glob_tac actfn sg displays in turn every subterm of subgoal sg, and 
applies the tactic, if any, given by actfn tm asp, (where tm is the subterm 
displayed on the side indicated by asp) to the sequent thus produced. All 
the display steps are eventually reversed, in the same way as for d_b_tac. 
glob_tac uses a top-down strategy in that it first tests the whole structure 
with the function actfn, then, recursively, each sub-structure. If actfn returns 
a tactic function (ie, not the empty list) then the changed structure is tested 
again. 

bup_tac is like glob_tac, but uses a bottom-up strategy, displaying and testing 
the smallest sub-structures first. 

fgl_fun, fbl_fun : 
(term -> int -> (thm -> thm) list) -> thm -> thm 

These functions are like glob_tac and bup_tac, but for doing forward proof. 
Their first argument is again a function actfn, called as actfn tm asp. Here 
actfn tm asp should return either [] or [thtr], where thtr is a theorem 
transformation, ie a function of type thm -> thm, which changes only the 
term tm displayed on the side indicated by asp. 

The first implementation of these tactics and functions was described in [7], 
§2.5.3. A later implementation gives the user great flexibility in programming the 
search strategy, enabling him/her to specify (for example) whether to traverse 
the structure top-down or bottom-up, whether or not to attempt further changes 
to a rewritten substructure, whether or not to repeat the whole procedure when 
any substructure is rewritten, etc. This implementation is described in [8, §4.4]. 

5.3 Application of the Search-and-Replace Tactics 

We now describe some uses of the search-and-replace tactics. Provided the func- 
tion actfn, as described in §5.2, returns an action which changes only the dis- 
played subterm tm, these tactics perform local “rewrites” on subexpressions. 

Examples of the use of these tactics are (1) to eliminate ** and ••, (2) 
to use the various derived distributive laws to push * and • down, and (3) to 
eliminate logical operators, where possible, using a set of logical introduction 
rules including andS and orA (see Fig. 1) - this is only for backward proof. 
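
The difference between the top-down and bottom-up strategies can be seen on example (1). The Python sketch below is an illustration under our own assumptions, not the Isabelle code: structures are nested tuples, a "rule" is a function returning a rewritten term or None, and display and un-display steps are left out entirely. The names `top_down`, `bottom_up` and `drop_double` are hypothetical.

```python
def drop_double(op):
    """Build a rule rewriting op(op(X)) to X; it returns None when it does not apply."""
    def rule(term):
        if (isinstance(term, tuple) and term[0] == op
                and isinstance(term[1], tuple) and term[1][0] == op):
            return term[1][1]
        return None
    return rule


RULES = [drop_double("*"), drop_double("•")]


def rewrite_root(term, rules):
    """Apply rules at the root, re-testing the changed term, until none applies."""
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new_term = rule(term)
            if new_term is not None:
                term, changed = new_term, True
                break
    return term


def top_down(term, rules):
    """Test the whole structure first, then recursively each sub-structure."""
    term = rewrite_root(term, rules)
    if isinstance(term, tuple):
        term = (term[0],) + tuple(top_down(arg, rules) for arg in term[1:])
    return term


def bottom_up(term, rules):
    """Rewrite the smallest sub-structures first, then the enclosing ones."""
    if isinstance(term, tuple):
        term = (term[0],) + tuple(bottom_up(arg, rules) for arg in term[1:])
    return rewrite_root(term, rules)


example = ("*", ("•", ("•", ("*", "X"))))   # the structure *••*X
print(top_down(example, RULES))
print(bottom_up(example, RULES))
```

On *••*X a single top-down pass removes •• but leaves the ** that this exposes, whereas the bottom-up pass reaches X. This is one reason a search strategy may need the option of repeating the whole procedure after a substructure has been rewritten.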

As these examples illustrate, we can achieve the power of a procedure which 
rewrites using equalities or equivalences, such as rewrite_tac and rewrite_rule 
in Isabelle and REWRITE_TAC and REWRITE_RULE in HOL, even though the δRA 
formalization does not contain any general rewriting facility. However δRA requires 
more than simple rewriting due to the logical introduction rules with 
“constraints”. 

The fact that pr_intr is available only for backward proof illustrates that 
these tactics can be used with uni-directional rules, such as the logical introduc- 
tion rules, as well as with bi-directional rules. This useful aspect of the tactics 
contrasts with the more typical rewriting tactics, such as in Isabelle and HOL, 
where the rules used to rewrite an expression must be equalities. 

In [12], §4, a function τ which translates structures to formulae - for example, 
to convert B; •A ⊢ •D, *C to B ∘ ⌣A ⊢ ⌣D ∨ ¬C - is crucial in the proof of 
soundness of δRA. Its implementation in a forward proof is another example of 
the use of a search-and-replace function. For this we can use some of the logical 
introduction rules, in their original form (as given in [12]), using fbl_fun. Note 
that this must be done bottom up, ie, on the smallest sub-structures first, because 
only at the bottom are the sub-structures also formulae. 
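
The translation itself is a simple recursion on structures in which each structural operator is read according to the side of the turnstile it occurs on. The Python sketch below is a hypothetical rendering of that idea and is not the definition from [12]. It assumes the usual display-logic conventions: in an antecedent ‘,’, ‘;’, I and E are read as ∧, ∘, ⊤ and 1, and in a succedent as ∨, +, ⊥ and 0; ‘*’ becomes ¬ and switches sides, while ‘•’ becomes the converse ⌣ and keeps the side. The tuple encoding and the names `tau` and `translate` are ours.

```python
def tau(struct, antecedent):
    """Translate a structure to a formula string.

    `antecedent` is True when the structure is read on the left of the
    turnstile and False when it is read on the right.
    """
    if isinstance(struct, str):
        if struct == "I":
            return "⊤" if antecedent else "⊥"
        if struct == "E":
            return "1" if antecedent else "0"
        return struct                      # a formula variable
    op = struct[0]
    if op == "*":                          # negation switches sides
        return "¬" + tau(struct[1], not antecedent)
    if op == "•":                          # converse keeps the side
        return "⌣" + tau(struct[1], antecedent)
    left = tau(struct[1], antecedent)
    right = tau(struct[2], antecedent)
    if op == ",":
        symbol = "∧" if antecedent else "∨"
    elif op == ";":
        symbol = "∘" if antecedent else "+"
    else:
        raise ValueError("unknown structural operator: " + str(op))
    return "(" + left + " " + symbol + " " + right + ")"


def translate(sequent):
    """Translate both sides of a sequent given as (antecedent, succedent)."""
    antecedent, succedent = sequent
    return tau(antecedent, True) + " ⊢ " + tau(succedent, False)


print(translate(((";", "B", ("•", "A")), (",", ("•", "D"), ("*", "C")))))
```

The recursion reaches the variables before it can assemble any compound formula, which mirrors the remark that the proof-level version must work bottom up.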

5.4 Other Tactics and Methods 

idf_tac. In the identity rule (id) p ⊢ p, p stands for a primitive proposition, 
not an arbitrary formula. It is, however, true that A ⊢ A for any formula A; 
this is proved by induction (see [12], Lemma 2). The restriction to primitive 
propositions is not reflected in the Isabelle implementation, which contains the 
rule A ⊢ A, where A is a variable standing for any formula. However the tactic 
idf_tac will convert an identity subgoal such as q ∘ (⌣p ∨ ¬r) ⊢ q ∘ (⌣p ∨ ¬r) 
into three separate subgoals q ⊢ q, p ⊢ p and r ⊢ r. The tactic can therefore 
provide a proof, from the rule p ⊢ p, of any given instance of the general theorem 
(which is provable only by induction) that A ⊢ A for any formula A. 
This tactic uses a set of six derived results (one for each logical operator), of 
which two examples are shown. 



$$\frac{A_1 \vdash A \qquad B_1 \vdash B}{A_1 \circ B_1 \vdash A \circ B}\ (\text{cong\_comp})$$

$$\frac{A \vdash A_1}{\neg A_1 \vdash \neg A}\ (\text{cong\_not})$$
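
The effect of idf_tac on an identity subgoal can be pictured as a walk over the formula. In the hypothetical Python sketch below (the tuple encoding and the name `identity_subgoals` are ours, not the ML code), each operator node corresponds to one backward application of a congruence rule such as cong_comp or cong_not, and each leaf yields a primitive subgoal p ⊢ p.

```python
def identity_subgoals(formula):
    """List the primitive subgoals (p, p), read as p ⊢ p, for the goal formula ⊢ formula.

    A formula is either a string (a primitive proposition) or a tuple
    (operator, argument, ...) with one or two arguments.
    """
    if isinstance(formula, str):
        return [(formula, formula)]
    subgoals = []
    for argument in formula[1:]:
        # One congruence step per operator; recurse into each premise.
        subgoals.extend(identity_subgoals(argument))
    return subgoals


# The example from the text: q ∘ (⌣p ∨ ¬r)
example = ("∘", "q", ("∨", ("⌣", "p"), ("¬", "r")))
print(identity_subgoals(example))
```

The number of subgoals equals the number of occurrences of primitive propositions, so a proposition that occurs twice gives rise to two copies of the same subgoal.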



Structural operators from logical ones. It is also possible to implement the function 
τ in a backwards proof. We can “create” logical connectives (as we read a 
proof upwards); this can on occasions help in managing a complex proof. The 
tactic tau_tac does this. It uses variants of the logical introduction rules where 
all the premises are “structure-free” and “connective-free”, ie of the form A ⊢ X 
or X ⊢ A. Structural or logical operators appear only in their conclusions. 

The inverse of τ is implemented for forward proof. This use of the search-and-replace 
method can only be done top down (using fgl_fun), and it uses the 
cut rule. Not all formula operators will necessarily be changed to structural ones: 
for example, ‘∘’ or ‘∧’ in a succedent position and ‘+’ or ‘∨’ in an antecedent 
position cannot be converted to structural operators. 



Flipping theorems. It may be observed in Fig. 1 of [12] that many of the rules in 
the two columns are symmetric about the turnstile, for example 

$$\frac{X ; (Y ; Z) \vdash W}{(X ; Y) ; Z \vdash W} \qquad \frac{W \vdash X ; (Y ; Z)}{W \vdash (X ; Y) ; Z}\ (\text{arS})$$



In fact most rules in the right-hand column of that figure could be derived 
from the corresponding rules in the left-hand column by “flipping” them about 
the turnstile, ie, interchanging the parts before and after ⊢. Likewise, “flipped” 
versions were derived for those rules in Fig. 1 which use only structural operators. 
The procedure for doing this has been formalized in the theorem transformation 
functions flip_st_p and flip_st_c, of type thm -> thm. These use the theorem 
dcp (see Fig. 1) to swap the sides of each sequent (by putting a * in front of 
each side), use the distributive laws to push the * down as far as possible, and 
rename variables to absorb the *. The transformations require the rules to be in 
structural form (see §6.1 below). This applies generally in display logics: having 
proved one theorem or rule in structural form, you then get one “free”, with 
each part moved to the other side of the turnstile. The theorem transformations 
flip_bl_p and flip_bl_c do the same thing, but with • instead of *. These 
transform a theorem to a corresponding one in terms of the converse relations. 



Automated cut-elimination. Belnap’s original proof of cut-elimination is operational; 
see the proof sketch in [12, Appendix]. It was implemented as follows. 

- The rules used in a proof are stored in the nodes of a tree using an appropriate 
  ML datatype, where each node represents a single proof step. Thus, given 
  the original sequent, we can reproduce the original proof. 

- Functions were written in ML to turn an Isabelle proof, using the various 
  compound tactics described in §5.2, into a proof consisting of single steps, 
  and then to represent that proof as a tree, as above. 

- Functions were written to allow us to rearrange the nodes of such proof 
  trees to push a cut upwards until the cut-formula becomes principal in both 
  premises of that cut (either due to an introduction rule or an axiom p ⊢ p). 

- Such a cut is either eliminated completely or is converted to cuts on smaller 
  formulae using a look-up table that gives a recipe of how to do so for each 
  logical connective. Display logic guarantees that such a recipe always exists. 

- This procedure is applied recursively to remove all cuts from the proof tree. 

Full details are given in [7, Ch. 5]. 

One novel use of this procedure would be to convert each proof of Chin and 
Tarski [6] into a δRA proof as follows. Simply begin to convert the proof by 
following the text. Sooner or later, the text “stores” the current reasoning as 
a lemma. At this point, we let the converted text constitute a premise of cut. 
We then continue to convert the text, until the lemma is used. At this point we 
insert a cut in the converted proof. Now applying the automated cut-elimination 
procedure will give a new, purely cut-free proof. Often this proof is shorter, 
although in the worst case it may be exponentially longer. 



6 Results from the δRA implementation 



6.1 Theorems proved using Isabelle 



We give some examples of theorems and derived rules that have been proved 
in δRA, using Isabelle. They show how some relation algebraic results can be 
turned into structural derived rules, ie, rules in which all the operators are structural, 
and all the variables denote arbitrary structures. This (together with the 
display postulates) enables them to be reused in doing proofs in structural form. 

Chin & Tarski [6] give a number of results whose proof is far from simple. 
For example, their Theorem 2.7 and the corresponding structural rule are 



$$(a \circ b) \land c \vdash a \circ (({\smile}a \circ c) \land b)$$

$$\frac{X ; (({\bullet}X ; Z), Y) \vdash W}{(X ; Y), Z \vdash W}$$



Chin & Tarski [6] Theorem 2.11 and its corresponding structural rule are 



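$$(r \circ s) \land (t \circ u) \vdash (r \circ (({\smile}r \circ t) \land (s \circ {\smile}u))) \circ u$$

$$\frac{(U ; (({\bullet}U ; W), (V ; {\bullet}X))) ; X \vdash Z}{(U ; V), (W ; X) \vdash Z}$$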



The Dedekind rule ([17], §5.3; [2], §2.1) and its corresponding structural rule are 



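$$(q \circ r) \land s \vdash (q \land (s \circ {\smile}r)) \circ (r \land ({\smile}q \circ s))$$

$$\frac{(X, (Z ; {\bullet}Y)) ; (Y, ({\bullet}X ; Z)) \vdash W}{(X ; Y), Z \vdash W}$$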



All these results were proved in δRA, using Isabelle. 

Some interesting theorems were useful in various areas of [7]. Again, they 
can be proved in the form of structural rules. These show an interesting similarity 
between the Boolean and relational “times” operators, ‘,’ and ‘;’, and their 
identities, I and E. Firstly, we have 

$$\frac{((X ; Y), E) ; Z \vdash W}{((X, Y) ; I), Z \vdash W}\ (\text{thab})$$



Next we have two results which can be proved from it. 



$$\frac{(X, E) ; (Y, Z) \vdash W}{((X, E) ; Y), Z \vdash W}\ (\text{th78sr})$$

$$\frac{(X ; I), (Y ; Z) \vdash W}{((X ; I), Y) ; Z \vdash W}\ (\text{th56sr})$$



These results have the following corollaries, giving examples of cases where 
the Boolean and relational products of two quantities are equal. 



$$\frac{(X, E) ; (E, Z) \vdash W}{(X, E), (E, Z) \vdash W}\ (\text{cor78i})$$

$$\frac{(X ; I), (I ; Z) \vdash W}{(X ; I) ; (I ; Z) \vdash W}\ (\text{cor56i})$$



6.2 Proving the RA axioms 



Using Isabelle we can mechanize the formal proofs of completeness of δRA with 
respect to RA. Here, we sketch how this is done. We show, informally, how 
any result provable in RA can be proved in δRA. This completeness result is 
weaker than Theorem 6 of [12] in that we only show how to derive, in δRA, any 
RA-equation (expressed in terms of formulae); by contrast, [12] also deals with 
δRA-sequents involving structures whose translation into a formula inequality 
is valid in RA. 

In [12] a (Lindenbaum) relation algebra is constructed from δRA. Here we 
consider the proof system which corresponds to RA. By the completeness theorem 
of Birkhoff [3], discussed in [14], the proof rules for RA consist only of 
substituting terms for variables in the relevant equations (defining a Boolean 
algebra and (RA2) to (RA8)), and replacing equals by equals. 

Firstly, following [12], Theorem 6, we let A = B mean A ⊢ B and B ⊢ A; 
more formally, we introduce the following rules 

$$\frac{A \vdash B \qquad B \vdash A}{A = B}\ (\text{eqI}) \qquad \frac{A = B}{A \vdash B}\ (\text{eqD1}) \qquad \frac{A = B}{B \vdash A}\ (\text{eqD2})$$

This equality was easily proved to be reflexive, symmetric and transitive. We define 
A ≤ B as (A ∧ B) = A, as in RA. It is then proved that A ≤ B ⟺ A ⊢ B. 
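
One way to see why this equivalence holds is sketched below. This is our own informal reconstruction, using only (id), andS, (cut) and the rules for = given above; it is not taken from the Isabelle proof scripts, which may proceed differently.

```latex
\begin{align*}
&(\Leftarrow)\ \text{Assume } A \vdash B. \\
&\quad A \vdash A \text{ by (id), so } A \vdash A \land B \text{ by andS;} \\
&\quad A \land B \vdash A \text{ holds in classical propositional logic;} \\
&\quad \text{hence } (A \land B) = A, \text{ that is, } A \le B. \\
&(\Rightarrow)\ \text{Assume } (A \land B) = A. \\
&\quad \text{Then } A \vdash A \land B, \text{ and } A \land B \vdash B \text{ holds classically;} \\
&\quad \text{hence } A \vdash B \text{ by (cut).}
\end{align*}
```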

Then the RA axioms can be proved; (RA2) to (RA8) are proved using Isabelle. 
Axiom (RA1), which says that the logical operators form a Boolean algebra, 
follows from the fact that δRA contains classical propositional logic. 

In using the RA axioms we would implicitly rely on the equality operator 
= being symmetric and transitive, which we proved, as indicated above. The 
proof system for RA also uses the fact, as noted above, that equality permits 
substitution of equals. That is, suppose C[A] and C[B] are formulae such that 
C[B] is obtained from C[A] by replacing some occurrences of A by B. If we now 
prove in RA that A = B and C[A] = D then we could deduce C[B] = D by 
substituting B for A. 

To show that this can be done in δRA we have to use induction on the 
structure of C[-]. Each induction step uses one of six derived results, of which 
two are shown below. 

$$\frac{A_1 = A \qquad B_1 = B}{A_1 \land B_1 = A \land B}\ (\text{eqc\_and}) \qquad \frac{A_1 = A}{{\smile}A_1 = {\smile}A}\ (\text{eqc\_conv})$$



Alternatively, any given instance of this result could be proved with the help 
of the tactic idf_tac, described in §5.4. Details are given in [7]. 



7 Comparing Different Logical Calculi for RA Using δRA 

In this section we describe how we used the implementation of δRA in Isabelle 
to demonstrate the soundness of some other logical systems for relation algebra. 
(As described in §6.2, we did the same for RA; as we use RA as a “reference 
point” we treated this as a demonstration of the completeness of δRA). 



7.1 Gordeev’s term rewriting system for RA 

Gordeev [10] has given a system NRA of rules for rewriting relational equations 
of the form F = ⊤ (where F is a relational formula and ⊤ is the unique largest 
element). A derivation of F in NRA is a sequence of permitted rewrites (“reductions”) 
F → ⋯ → ⊤. Thus NRA proves equations of the form F = ⊤. 
However it is known that every Boolean combination of equations is equivalent 
to an equation of the form F = ⊤ (see [16], pp. 435-6, pp. 439-440). Gordeev 
has shown that F is derivable in NRA iff F = ⊤ is derivable in RA. 

It is expected that details of NRA will be published in due course; the actual 
rules are not pertinent here, since we simply outline how Isabelle was used as 
the basis of a proof of the soundness of NRA. 

NRA uses “literals” (L) and “positive terms” (A, B, C, D). Literals are of 
the shape P, ¬P, ⌣P, ¬⌣P, ⊤, ⊥, 0 or 1, for any given primitive relation 
symbol P. Positive terms are built up from literals using the binary operators. 
Any RA term is reducible to a positive term by the distributive rules which 
show how the unary operators distribute over the binary operators (see Fig. 1 
for the structural counterparts). 

In NRA there are 12 rewriting rules (NRA1)-(NRA12), but, given a formula 
F, some may only be used to rewrite F or a conjunct of F. The permitted 
rewrites, or “reductions”, are 

- F[G′] → F[G″], where G′ → G″ is an instance of any rule (NRAi) except 
  for (NRA9) and (NRA10) 

- G₁ ∧ G₂ ∧ … ∧ G′ⱼ ∧ … ∧ Gₖ → G₁ ∧ G₂ ∧ … ∧ G″ⱼ ∧ … ∧ Gₖ, where 1 ≤ j ≤ k 
  and G′ⱼ → G″ⱼ is an instance of (NRA9) or (NRA10) 

We take it that F[-] above must be such that G is a ‘sub-positive-term’ of 
F[G], ie, that G is not a literal properly contained in a larger literal within F[G] 
(for example, F[G] may not be ¬G ∨ H). This point was not spelt out in the brief 
communication [10], and we realized the need for it only in doing the proof in 
Isabelle. This gives an example of the value of a computer-based theorem prover 
in ensuring such details are not overlooked. 

The following lemma explains the NRA reduction step in terms of δRA. 

Lemma 1. Whenever F′ → F″ in NRA, then ⊤ ⊢ F″ ⟹ ⊤ ⊢ F′ in δRA. 

Proof. For rules other than (NRA9) and (NRA10), we show that F″ ⊢ F′ in 
δRA. Let F′ = F[G′] and F″ = F[G″], where the NRA rule is G′ → G″. For 
each rule the δRA theorem G″ ⊢ G′ was proved in Isabelle. 

From G″ ⊢ G′ we show F[G″] ⊢ F[G′] by induction on the structure of F[-]. 
As F[-] uses only literals and the binary operators, the inductive steps rely on 
the four δRA-theorems cong_or, cong_and, cong_comp and cong_rs (see §5.4), 
and the identity rule A ⊢ A. 

Since F″ ⊢ F′, if we assume ⊤ ⊢ F″, then the (cut) rule gives ⊤ ⊢ F′. 

We now look at rules (NRA9) and (NRA10). For these rules, where the NRA 
rule is G′ → G″, we have proved the δRA derived rule ⊤ ⊢ G″ ⟹ ⊤ ⊢ G′ in 
Isabelle. So from this we need to show ⊤ ⊢ F[G″] ⟹ ⊤ ⊢ F[G′], where G′ is 
a conjunct of F[G′]. This is shown by induction on the number of conjuncts in 
F[-]. The inductive step uses the following δRA-theorem. 



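$$\frac{\dfrac{\top \vdash G_1''}{\top \vdash G_1'} \qquad \dfrac{\top \vdash G_2''}{\top \vdash G_2'}}{\dfrac{\top \vdash G_1'' \land G_2''}{\top \vdash G_1' \land G_2'}}\ (\text{G910meth})$$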



□ 



G910meth is a theorem whose use of Isabelle’s metalogic is more complex 
than others seen so far, with nested occurrences of the metalogical operator ==>. 

Lemma 1 implies that for every NRA-derivation of F there is a δRA-derivation 
of ⊤ ⊢ F. Thus NRA is sound with respect to RA. 



7.2 A point-variable sequent calculus 

In [7], Ch. 4, we described the sequent calculus system of Maddux [15], ℳ, for 
relation algebras, and showed how its sequents and proofs correspond to sequents 
and proofs in δRA. We characterized the sequents in ℳ which can be translated 
into δRA by describing the pattern of relation variables and point variables in 
the ℳ-sequent as a graph, and showing that whether or not an ℳ-sequent can 
be translated into δRA depends on the shape of the graph. 

As a sequent in ℳ can often be translated to any of several sequents in δRA, 
we showed that these several sequents are equivalent in δRA. 

The result relating proofs in the two systems was that if an ℳ-sequent S can 
be translated into a δRA-sequent S′, then any ℳ-proof of S can be translated 
into a δRA-proof of S′, provided that the ℳ-proof uses at most four point 
variables. This required showing how such a proof in ℳ can be divided into 
parts each of which could be turned into a part of a proof in δRA. 

As not all ℳ-sequents containing at most four point variables can be translated 
into δRA, not all the intermediate stages of such a proof could be translated 
into δRA, that is, not every individual step of the ℳ-proof could be 
converted into a corresponding step of a proof in δRA. Rather, a larger portion 
of the ℳ-proof had to be treated as a unit; we showed that we could always rearrange 
the ℳ-proof to get these portions of a particular form, which conformed 
to a theorem that we had proved in Isabelle for δRA. 

This result provided a constructive proof of one direction of the result of 
Maddux ([15], Thm. 6) that the class of relation algebras is characterized by the 
ℳ-sequents provable using four variables. 

8 Conclusion 

We described an implementation, in Isabelle, of the display calculus δRA for 
relation algebras. This is the first implementation of any display calculus. We 
intend to generalise it to other display calculi in future work. Although a δRA 
proof involved frequent and tedious use of the display postulates, we provided a 
number of tactics and other functions to make this aspect of constructing proofs 
easier. We also provided tactics to search for certain patterns and replace them 
wherever they are found, giving the power of a rewrite rule in a term rewriting 
system. Most, but not all, of the rules for introducing logical connectives could 
be used in this manner. Certain other tactics and functions, such as those which 
transform between structural and logical forms of sequents, have proved con- 
venient in proofs on occasions. There is scope for further work to devise more 
automatic techniques than those described here. 

The use of display postulates to “display” a subterm resembles the use of 
window inference [20], [13] to “focus” upon a subterm. Further work is needed to 
connect these two modes of reasoning. 

Some useful derived rules were proved, for example those showing how the
unary operators distribute over the binary operators. We showed how the δRA
system can justify inferences in the (axiomatic) theory of relation algebras.
Although this result appears as Theorem 6 of [12], the published proof omits most
of the detail. The detail is provided in the Isabelle proofs, giving an example of
the value of a mechanized theorem prover to confirm the details of proofs which
are too voluminous to publish or even too tedious to check by hand.

Some important (and difficult) theorems from the literature were also proved, 
for example some from [6]. Other interesting theorems were discovered and 
proved, some of which were needed for the work outlined in §7.2. 

In §7.1 we described an equational theory due to Gordeev for relation
algebras, and showed how its inferences can be justified in δRA. As mentioned there,
we obtained a corresponding result for the sequent calculus system of Maddux.
This showed how Isabelle could be used to demonstrate the relationship between
different logical systems for relation algebra.

Theorems 3.6 and 3.8 of [5] show that Peirce algebras can be embedded inside 
relation algebras in various ways. The authors also show that Peirce algebras 
form the basis of most of the common Knowledge Representation languages like 
KL-ONE [4]. Minor modifications of our system should give mechanised proof 
systems for these logics. 







References 

1. Nuel D. Belnap, Display Logic, Journal of Philosophical Logic 11 (1982), 375-417. 

2. Rudolf Berghammer & Claudia Hattensperger, Computer-Aided Manipulation of 
Relational Expressions and Formulae Using RALE, preprint. 

3. Garrett Birkhoff, On the Structure of Abstract Algebras, Proc. Cambridge Phil. 
Soc. 31 (1935), 433-454. 

4. R.J. Brachman & J.G. Schmolze, An overview of the KL-ONE knowledge repre- 
sentation system. Cognitive Science 9(2) (1985), 171-216. 

5. Chris Brink, Katarina Britz & Renate A. Schmidt, Peirce Algebras, Formal Aspects 
of Computing 6 (1994), 339-358. 

6. Louise H. Chin & Alfred Tarski, Distributive and Modular Laws in the Arithmetic 
of Relation Algebras, University of California Publications in Mathematics, New 
Series, I (1943-1951), 341-384. 

7. Jeremy E. Dawson, Mechanised Proof Systems for Relation Algebras, Grad. Dip.
Sci. sub-thesis, Dept of Computer Science, Australian National University.
Available at http://arp.anu.edu.au:80/~jeremy/thesis.dvi

8. Jeremy E. Dawson, Simulating Term-Rewriting in LPF and in Display Logic,
submitted. Available at http://arp.anu.edu.au:80/~jeremy/rewr/rewr.dvi

9. Jean H. Gallier, Logic for Computer Science : Foundations of Automatic Theorem 
Proving, Harper & Row, New York, 1986. 

10. Lev Gordeev, personal communication. 

11. Rajeev Gore, Intuitionistic Logic Redisplayed, Automated Reasoning Project TR- 
ARP-1-95, ANU, 1995. 

12. Rajeev Gore, Cut-free Display Calculi for Relation Algebras, Computer
Science Logic, Lecture Notes in Computer Science 1249 (1997), 198-210. (or see
http://arp.anu.edu.au/~rpg/publications.html).

13. Jim Grundy, Transformational Hierarchical Reasoning, The Computer Journal 39 
(1996), 291-302. 

14. Gerard Huet & Derek C. Oppen, Equations and Rewrite Rules - A Survey, in 
Formal Languages: Perspectives and Open Problems, R.V. Book (ed). Academic 
Press (1980), 349-405. 

15. Roger D. Maddux, A Sequent Calculus for Relation Algebras, Annals of Pure and 
Applied Logic 25 (1983), 73-101. 

16. Roger D. Maddux, The Origin of Relation Algebras in the Development and Ax- 
iomatization of the Calculus of Relations, Studia Logica 50 (1991), 421-455. 

17. David von Oheimb & Thomas F. Gritzner, RALL: Machine-supported proofs for 
Relation Algebra, Proceedings of CADE-14, Lecture Notes in Computer Science 
1249 (1997), 380-394. 

18. Lawrence C. Paulson, The Isabelle Reference Manual, Computer Laboratory, Uni- 
versity of Cambridge, 1995. 

19. Lawrence C. Paulson, Isabelle’s Object-Logics, Computer Laboratory, University 
of Cambridge, 1995. 

20. Peter J. Robinson & John Staples, Formalizing a Hierarchical Structure of Practical 
Mathematical Reasoning, J. Logic & Computation, 3 (1993), 47-61. 

21. Heinrich Wansing, Sequent Calculi for Normal Modal Propositional Logics, Journal 
of Logic and Computation 4 (1994), 124-142. 




Relative Similarity Logics are Decidable:
Reduction to FO² with Equality*



Stephane Demri and Beata Konikowska 

1 Laboratoire LEIBNIZ - C.N.R.S. 

2 Institute of Computer Science, Polish Academy of Sciences
Warszawa, Poland 



Abstract. We show the decidability of the satisfiability problem for relative
similarity logics that allow classification of objects in presence of
incomplete information. As a side-effect, we obtain a finite model property
for such similarity logics. The proof technique consists of reductions
into the satisfiability problem for the decidable fragment FO² with equality
from classical logic. Although the reductions stem from the standard
translation from modal logic into classical logic, our original approach
(for instance handling nominals for atomic properties and decomposition
in terms of components encoded in the reduction) can be generalized to
a larger class of relative logics, opening ground for further investigations.



1 Introduction 

Background. Classification of objects in the presence of incomplete information
has long been recognized as an issue of concern for various AI problems that deal
with commonsense knowledge as well as scientific and engineering knowledge (expert
systems, image recognition, knowledge bases and so on). Similarity (sometimes
termed "weak equivalence") provides a basic tool whenever we classify
objects with respect to their properties. There exist several formal systems
capturing the notion of similarity from the logical viewpoint [Vak91a,Vak91b]. In
the present paper we build on the formalization given in [Kon97], where, contrary
to [Vak91a,Vak91b], similarity is treated as a relative notion. More precisely, in
[Kon97] similarity is defined as a reflexive and symmetric binary relation $sim_P$,
parametrized by the set $P$ of properties with respect to which the objects are
classified as either similar or dissimilar. Thus, instead of a single similarity
relation we have a whole family $(sim_P)_{P \subseteq PROP}$, where $PROP$ is the
set of all the properties considered in a given system. When talking about
similarity or equivalence it is natural to talk about the lower and upper
approximations $L(sim_P)A$, $U(sim_P)A$ of a given set $A$ of objects with respect
to the similarity $sim_P$. The above operations stem from rough set theory [Paw81],
with $L(sim_P)A$ being the set of all objects in $A$ which are not similar (in the
sense of $sim_P$) to any object outside $A$, and $U(sim_P)A$ the set of all objects
of the universe which

* This work has been partially supported by the Polish-French Project “Rough-set 
based reasoning with incomplete information: some aspects of mechanization” , 1)7004. 
are similar to some object in $A$. Thus, the above operations could be considered
as the operations of taking the "interior" and "closure" of the set $A$ with respect
to similarity $sim_P$. However, the analogy is not complete, since similarity is not
transitive, and hence the above operations are not idempotent.

The practical importance of the approximation operations is quite obvious: if
we can distinguish objects only up to similarity, then when looking for objects
belonging to some set $A$ we should take those in $L(sim_P)A$ if we want to consider
only the objects sure to belong to $A$, and those in $U(sim_P)A$ if our aim is not
to overlook any object which might possibly belong to $A$.

Our objectives. The formal system introduced in [Kon97] features the above
operations, which generate a family of interdependent relative modalities. The
resulting polymodal logic is equipped with a complete deduction system.
However, for any practical application of the similarity logic in the areas of
Artificial Intelligence mentioned above, an issue of great importance is whether
the logic is decidable. A positive answer to this question might
provide not only a decision procedure, but also a better understanding of the
logical analysis of similarity. These are the objectives of the present paper. Up
to now, the question of decidability has been open, which is hardly surprising
in view of the high expressive power of the logic. Indeed, its language implicitly
admits the universal modal operator, and nominals for atomic propositions
as well as for atomic properties; in addition, the modal operators are
interdependent. Nominals (or names) are used in numerous non-classical logics with
various motivations (see e.g. [Orlo84a,PT91,Bla93,Kon97]) and they usually greatly
increase the expressive power of the logics (causing additional difficulties with
proving (un)decidability, see e.g. [PT91]). Furthermore, since finite submodels
can be captured in the language up to isomorphism (which is yet further
evidence of the expressive power of similarity logics), there is no hope of proving
decidability by showing a finite model property for a class of models strictly
including the class of standard models with a bound on the model's size (see e.g.
[Vak91c,Bal97]). On the other hand, the intersection operator, which is
implicitly present in the interpretation of the modal terms, is known to behave badly
for filtration-like constructions.

Our contribution. We prove that the logic defined in [Kon97], together with some
of its variants, is decidable by translating it to a decidable fragment of
first-order logic: the two-variable fragment FO² containing equality, but no function
symbols (see e.g. [Mor75]). Although there are known methods of handling the
universal modal operator, the Boolean operations for modal terms and nominals
for atomic propositions in order to translate them into FO² with equality (see for
example the survey papers [Ben98,Var97]), the extra features of the similarity
logics require significant extra work in order to be translated to such
a fragment as well. This is achieved in the present paper. Unlike the Boolean Modal
Logic BML [GP90], for which decidability can be proved via the finite model
property for a class of models, reduction of satisfiability for the similarity logics
to FO² with equality is the only decidability proof we are aware of, and
therefore we solve an open problem here. As a side-effect, we prove the finite
model property. More importantly, the novelty of our approach allows us to
generalize the translation to a large class of relative modal logics.

Plan of the paper. The paper is structured as follows. In Section 2 the relative
similarity logics we deal with in the paper are defined, and some results about
their expressive power and complexity are stated. In Section 3, we define the
translation of the main relative similarity logic $\mathcal{L}$ into FO² with
equality, and show its faithfulness. Decidability and the finite model property for
$\mathcal{L}$ are obtained partly by considering the analogous properties of the
fragment FO² with equality. In Section 4, we investigate some variants of
$\mathcal{L}$, and show their decidability and the finite model property. Section 5
concludes the paper by providing some generalizations of the results proved in the
preceding sections, and stating what is known about the computational complexity
of $\mathcal{L}$-satisfiability. In addition, several examples of formula
translations are given.

2 Similarity logics 

2.1 Information systems and similarity 

Information systems, proposed for the representation of knowledge, are the
foundational structures on which the semantics of the relative similarity logic
is based. An information system $S$ is defined as a pair $(ENT, PROP)$ where
$ENT$ is a non-empty set of entities (also called objects) and $PROP$ is a
non-empty set of properties (also called attributes), see e.g. [Paw81]. Each
property $prop$ is a mapping $ENT \to \mathcal{P}(Val_{prop}) \setminus \{\emptyset\}$,
where $Val_{prop}$ is the set of values of the property $prop$, see e.g. [OP84]. In
that setting, two entities $e_1, e_2$ are said to be similar with respect to some
set $P \subseteq PROP$ of properties (in short $e_1\ sim_P\ e_2$) iff for any
$prop \in P$, $prop(e_1) \cap prop(e_2) \neq \emptyset$. The polymodal frames of the
relative similarity logics are isomorphic to structures of the
form $(ENT, PROP, (sim_P)_{P \subseteq PROP})$. Other relationships between entities
can be found in the literature, see e.g. [FdC084,Orlo84b]. For instance, two entities
$e_1, e_2$ are said to be negatively similar (resp. indiscernible) with respect to
some set $P \subseteq PROP$ of properties (in short $e_1\ nsim_P\ e_2$, resp.
$e_1\ ind_P\ e_2$) iff for any $prop \in P$,
$-prop(e_1) \cap -prop(e_2) \neq \emptyset$ (resp. $prop(e_1) = prop(e_2)$).
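
The three relations just defined can be tried out on a small example. The
following Python sketch is illustrative only: the information system (entity,
property and value names) is hypothetical, and the complement $-prop(e)$ is
assumed to be taken inside $Val_{prop}$.

```python
# Hypothetical toy information system (entity, property and value names are
# illustrative only). Each property maps an entity to a non-empty set of values.
VAL = {
    "colour": {"red", "green", "blue"},
    "shape": {"round", "square"},
}
INFO = {
    "colour": {"e1": {"red"}, "e2": {"red", "green"}, "e3": {"green", "blue"}},
    "shape": {"e1": {"round"}, "e2": {"round"}, "e3": {"round", "square"}},
}
ENT = ["e1", "e2", "e3"]


def sim(P, a, b):
    """a sim_P b: for every prop in P, prop(a) and prop(b) share a value."""
    return all(INFO[p][a] & INFO[p][b] for p in P)


def nsim(P, a, b):
    """a nsim_P b: for every prop in P, the complements (taken inside
    Val_prop, an assumption of this sketch) share a value."""
    return all((VAL[p] - INFO[p][a]) & (VAL[p] - INFO[p][b]) for p in P)


def ind(P, a, b):
    """a ind_P b: for every prop in P, prop(a) equals prop(b)."""
    return all(INFO[p][a] == INFO[p][b] for p in P)
```

In this toy system e1 and e2 are similar with respect to colour, as are e2 and
e3, while e1 and e3 are not, so similarity is not transitive. With the empty
set of properties every pair is similar, and similarity with respect to a union
of property sets is the conjunction of the similarities for each set.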

The family $(sim_P)_{P \subseteq PROP}$ of similarity relations stemming from some
information system $S = (ENT, PROP)$ induces certain approximations of subsets
of entities in $S$. Indeed, let $L(sim_P)X$ (resp. $U(sim_P)X$) be the lower (resp.
upper) $sim_P$-approximation of the set $X$ of entities defined as follows:

- $L(sim_P)X = \{e \in ENT : \forall e' \in ENT,\ (e, e') \in sim_P \text{ implies } e' \in X\}$;

- $U(sim_P)X = \{e \in ENT : \exists e' \in ENT,\ (e, e') \in sim_P \text{ and } e' \in X\}$.

Obviously $L(sim_P)X \subseteq X \subseteq U(sim_P)X$ and
$L(sim_P)X = ENT \setminus U(sim_P)(ENT \setminus X)$. These approximations are
crucial in rough set theory since they allow one to classify objects in the presence
of incomplete information. That is why the semantics of modal operators in the
relative similarity logics shall use these approximations as modal operations. We
invite the reader to consult [Oe97] for examples of rough set analysis of
incomplete information.
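
As a minimal sketch of the two approximation operations, the Python fragment
below computes them for a relation given explicitly as a set of pairs. The
universe, the relation and the set X are hypothetical examples, not taken from
the paper.

```python
def lower(rel, universe, X):
    """L(rel)X: entities all of whose rel-neighbours lie in X."""
    return {e for e in universe
            if all(f in X for f in universe if (e, f) in rel)}


def upper(rel, universe, X):
    """U(rel)X: entities having at least one rel-neighbour in X."""
    return {e for e in universe
            if any(f in X for f in universe if (e, f) in rel)}


# Hypothetical similarity relation: reflexive and symmetric, but not
# transitive (1 ~ 2 and 2 ~ 3, yet 1 and 3 are not related).
ENT = {1, 2, 3, 4}
REL = {(e, e) for e in ENT} | {(1, 2), (2, 1), (2, 3), (3, 2)}
X = {1, 2}

LOW = lower(REL, ENT, X)   # entities surely in X
UPP = upper(REL, ENT, X)   # entities possibly in X
```

Applying `lower` twice to X gives a strictly smaller set than applying it once
in this example, which illustrates why the operations are not idempotent when
the relation is not transitive.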



2.2 Syntax and semantics 

The set of primitive symbols of the polymodal language L is composed of

- a set VARSE = $\{E_1, E_2, \ldots\}$ of variables representing sets of entities,

- a set VARE = $\{x_1, x_2, \ldots\}$ of variables representing individual entities,

- symbols for the classical connectives $\neg$, $\wedge$ (negation and conjunction), and

- a countably infinite set $\{[A] : A \in$ TERM$\}$ of unary modal operators where the
set TERM of terms is the smallest set containing

  • the constant 0 representing the empty set of properties,

  • a countably infinite set VARP = $\{p_1, p_2, \ldots\}$ of variables representing
  individual properties,

  • a countably infinite set VARSP = $\{P_1, P_2, \ldots\}$ of variables representing
  sets of properties,

and closed under the Boolean operators $\cap$, $\cup$, $-$.

The formation rules of the set FORM of formulae are those of the classical
propositional calculus plus the rule: if $F \in$ FORM and $A \in$ TERM, then
$[A]F \in$ FORM. We use the connectives $\vee$, $\Rightarrow$, $\Leftrightarrow$,
$\langle A \rangle$ as abbreviations with their standard meanings.
For any syntactic category X and any syntactic object O, we write X(O) to denote
the set of those elements of X that occur in O. Moreover, for any syntactic
object O, we write |O| to denote its length (or size), that is the number of symbol
occurrences in O. As usual, sub(F) denotes the set of subformulae of the formula
F (including F itself).

Definition 1. A TERM-interpretation $v$ is a map $v$ : TERM $\to \mathcal{P}(PROP)$
such that $PROP$ is a non-empty set and for any $A_1, A_2 \in$ TERM,

- if $A_1, A_2 \in$ VARP and $A_1 \neq A_2$, then $v(A_1) \neq v(A_2)$,

- if $A_1 \in$ VARP, then $v(A_1)$ is a singleton, i.e. $v(A_1) = \{prop\}$ for some
$prop \in PROP$,

- $v(0) = \emptyset$, $v(A_1 \cap A_2) = v(A_1) \cap v(A_2)$,
$v(A_1 \cup A_2) = v(A_1) \cup v(A_2)$,

- $v(-A_1) = PROP \setminus v(A_1)$.

For any $A, B \in$ TERM, we write $A \equiv 0$ (resp. $A \equiv B$) when for any
TERM-interpretation $v$, $v(A) = \emptyset$ (resp. $v(A) = v(B)$).

Definition 2. A model $\mathcal{U}$ is a structure
$\mathcal{U} = (ENT, PROP, (sim_P)_{P \subseteq PROP}, v)$
where $ENT$ and $PROP$ are non-empty sets and $(sim_P)_{P \subseteq PROP}$ is a
family of binary relations over $ENT$ such that

- for any $\emptyset \neq P \subseteq PROP$, $sim_P$ is reflexive and symmetric,

- for any $P, P' \subseteq PROP$, $sim_{P \cup P'} = sim_P \cap sim_{P'}$ and
$sim_\emptyset = ENT \times ENT$.

Moreover, $v$ is a mapping $v$ : VARE $\cup$ VARSE $\cup$ TERM
$\to \mathcal{P}(ENT) \cup \mathcal{P}(PROP)$ such that $v(E) \subseteq ENT$ for any
$E \in$ VARSE, $v(x) = \{e\}$, where $e \in ENT$, for any $x \in$ VARE, and the
restriction of $v$ to TERM is a TERM-interpretation.

Since the set of nominals for properties is countably infinite, and any two
different nominals are interpreted by different properties, each model has an
infinite set of properties. Let $\mathcal{U} = (ENT, PROP, (sim_P)_{P \subseteq PROP}, v)$
be a model. As usual, we say that a formula $F$ is satisfied by an entity
$e \in ENT$ in $\mathcal{U}$ (written $\mathcal{U}, e \models F$) if the following
conditions are satisfied.

- $\mathcal{U}, e \models x$ iff $\{e\} = v(x)$; $\mathcal{U}, e \models E$ iff $e \in v(E)$;

- $\mathcal{U}, e \models \neg F$ iff not $\mathcal{U}, e \models F$;
$\mathcal{U}, e \models F \wedge G$ iff $\mathcal{U}, e \models F$ and $\mathcal{U}, e \models G$;

- $\mathcal{U}, e \models [A]F$ iff for any $e' \in sim_{v(A)}(e)$, $\mathcal{U}, e' \models F$.

A formula $F$ is true in a model $\mathcal{U}$ (written $\mathcal{U} \models F$) iff
for any $e \in ENT$, $\mathcal{U}, e \models F$ or, equivalently, iff for some
$e \in ENT$, $\mathcal{U}, e \models [0]F$. A formula $F$ is said to be
valid iff $F$ is true in all models. A formula $F$ is said to be satisfiable iff
$\neg F$ is not valid. The similarity logic $\mathcal{L}$ is said to have the finite
model property iff every satisfiable formula is satisfied in some model
$\mathcal{U} = (ENT, PROP, (sim_P)_{P \subseteq PROP}, v)$ with a finite set $ENT$
such that, for any $P \subseteq PROP$, $sim_P = sim_{P \cap P_0}$,
where $P_0 \subseteq PROP$ is finite and nonempty ($P_0$ is called the relevant part
of $PROP$ in $\mathcal{U}$). Consequently, if $\mathcal{L}$ has the finite model
property, then every satisfiable formula has a model
$(ENT, PROP, (sim_P)_{P \subseteq PROP}, v)$ such that for any
$\emptyset \neq P \subseteq PROP$, $sim_P = \bigcap_{x \in P} sim_{\{x\}}$.
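
The satisfaction clauses can be sketched as a small evaluator in Python. This
is a minimal sketch under stated assumptions, not the authors' method: the
tuple encoding of terms and formulae, the names `term_value`, `sim` and `sat`,
and the sample model are all hypothetical; only a finite fragment of PROP is
represented (the requirement that infinitely many nominals denote distinct
properties is ignored), and each relation is generated from per-property base
relations by intersection, which guarantees the two conditions on the family
of relations.

```python
def term_value(A, v, PROP):
    """Value of a term: ("zero",), ("var", name), ("neg", A),
    ("cap", A, B) or ("cup", A, B)."""
    tag = A[0]
    if tag == "zero":
        return frozenset()
    if tag == "var":                      # nominal p_i or set variable P_i
        return frozenset(v[A[1]])
    if tag == "neg":
        return frozenset(PROP) - term_value(A[1], v, PROP)
    left, right = term_value(A[1], v, PROP), term_value(A[2], v, PROP)
    return left & right if tag == "cap" else left | right


def sim(P, base, ENT):
    """sim_P as the intersection of the base relations; sim_{} is universal."""
    rel = {(a, b) for a in ENT for b in ENT}
    for x in P:
        rel &= base[x]
    return rel


def sat(F, e, model):
    """Satisfaction of F at entity e.  Formulae: ("nom", x), ("set", E),
    ("not", F), ("and", F, G), ("box", A, F)."""
    ENT, PROP, base, v = model
    tag = F[0]
    if tag == "nom":
        return v[F[1]] == e
    if tag == "set":
        return e in v[F[1]]
    if tag == "not":
        return not sat(F[1], e, model)
    if tag == "and":
        return sat(F[1], e, model) and sat(F[2], e, model)
    if tag == "box":
        rel = sim(term_value(F[1], v, PROP), base, ENT)
        return all(sat(F[2], f, model) for f in ENT if (e, f) in rel)
    raise ValueError(tag)


# Hypothetical finite model fragment.
ENT = {1, 2, 3}
PROP = {"a", "b"}
IDENT = {(e, e) for e in ENT}
BASE = {"a": IDENT | {(1, 2), (2, 1)}, "b": IDENT | {(2, 3), (3, 2)}}
V = {"p1": {"a"}, "P1": {"a", "b"}, "x1": 1, "E1": {1, 2}}
MODEL = (ENT, PROP, BASE, V)
```

Since the term 0 denotes the empty set of properties, the operator [0] ranges
over all entities, which is how the universal modality is implicitly present in
the language.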

The similarity logic defined in [Kon97] is not exactly the logic $\mathcal{L}$
defined above, since in [Kon97] the set of properties was supposed to be fixed, and
constants representing properties were used instead of variables. For any set $X$,
we write $\mathcal{L}_X$ to denote the logic that differs from $\mathcal{L}$ in the
following points: (1) the set of properties $PROP$ is fixed in all the models and
equals $X$, (2) VARP and $X$ have the same cardinality. In various places in the
paper, we implicitly use the facts that satisfiability is insensitive to the
renaming of any sort of variables, and that any two models isomorphic in the
standard sense satisfy the same set of formulae. Moreover, for the logics
$\mathcal{L}_X$, as far as satisfiability is concerned, it is irrelevant whether we
fix the interpretation of each nominal for the properties.

2.3 Expressive power and complexity lower bound 

Since the language of the relative similarity logic $\mathcal{L}$ contains nominals,
the universal modal operator and a family of standard modal operators, its
expressive power is quite high. In Proposition 3 below, we state a counterpart of
Corollary 4.17 in [GG93] (see also Theorem 2.8 in [PT91]) saying that finite
submodels can be captured in the language up to isomorphism: for any finite
structure $S$ there is a formula $F_S$ such that a model $\mathcal{U}$ satisfies
$F_S$ iff $S$ is a substructure of $\mathcal{U}$ up to isomorphism. Although
this shows that the expressive power of the logic is high, it has a very unpleasant
consequence: there is no hope of characterizing $\mathcal{L}$-satisfiability by a
class of finite non-standard models the way it is done in [Vak91c,Bal97]. It means
for instance that proving the finite model property of $\mathcal{L}$ by a standard
filtration-like technique becomes highly improbable since $\mathcal{L}$ has
implicitly the intersection operator in the language.

In Proposition 3 below, the structure $S$ encodes a finite part of some model.
The set $\{1, \ldots, n\}$ should be understood as a finite set of entities, and
$\{1, \ldots, l\}$ as a finite set of properties. Moreover, only a finite set
$\{E_1, \ldots, E_k\}$ of atomic propositions is taken into account. For
$i \in \{1, \ldots, n\}$ and $j \in \{1, \ldots, k\}$, $j \in v'(i)$
is to mean that $E_j$ is satisfied by $i$.

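Proposition 3. Let
$S = (\{1, \ldots, n\}, \{1, \ldots, l\}, (R(P))_{P \subseteq \{1, \ldots, l\}}, v')$
be a structure such that each $R(P)$ is a reflexive and symmetric relation,
$R(\emptyset)$ is the universal relation, for any $P, P' \subseteq \{1, \ldots, l\}$,
$R(P \cup P') = R(P) \cap R(P')$ and $v'$ is a mapping
$\{1, \ldots, n\} \to \mathcal{P}(\{1, \ldots, k\})$ for some $k \geq 1$. Then, there
is a formula $F_S$ such that for any $\mathcal{L}$-model $\mathcal{U}$,
$\mathcal{U} \models F_S$ iff there is a 1-1 mapping
$\Psi_1 : \{1, \ldots, n\} \to ENT$ and an injective mapping
$\Psi_2 : \{1, \ldots, l\} \to PROP$ with the following properties

- for any $i \in \{1, \ldots, k\}$, $v(E_i) = \{\Psi_1(s) : i \in v'(s)\}$;

- for any $P \subseteq PROP$ such that there is $P' \subseteq \{1, \ldots, l\}$
verifying $\{\Psi_2(i) : i \in P'\} = P$, we have $sim_P = R(P')$.

Proof. The formula $F_S$ is the conjunction of the following formulae.

1. $[0](x_1 \vee \ldots \vee x_n) \wedge [0] \bigwedge_{1 \leq i < j \leq n} \neg(x_i \wedge x_j)$;

2. $[0](\bigwedge_{i \in \{1, \ldots, n\}} (x_i \Rightarrow \bigwedge_{u \in \{1, \ldots, k\}} s_u E_u))$
where $s_u$ is the empty string if $u \in v'(i)$, otherwise $s_u = \neg$;

3. for any $\{i_1, \ldots, i_q\} \subseteq \{1, \ldots, l\}$ and all $i \in \{1, \ldots, n\}$,

$[0](x_i \Rightarrow ((\bigwedge_{(i,j) \in R(\{i_1, \ldots, i_q\})} \langle p_{i_1} \cup \ldots \cup p_{i_q} \rangle x_j) \wedge (\bigwedge_{(i,j) \notin R(\{i_1, \ldots, i_q\})} \neg \langle p_{i_1} \cup \ldots \cup p_{i_q} \rangle x_j)))$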

Before establishing decidability of $\mathcal{L}$-satisfiability, one can provide a
lower bound for the complexity of this problem using [Hem96].

Proposition 4. $\mathcal{L}$-satisfiability is EXPTIME-hard.

When no nominals for properties and entities are allowed, satisfiability can
be shown to be in EXPTIME [Dem98].

3 Translation from $\mathcal{L}$ into FO² with equality

3.1 A known decidable fragment of classical logic

Consistently with the general convention, by FO² we mean the fragment of
first-order logic (FOL for short) without equality or function symbols using only 2
variables (denoted by $y_0$ and $y_1$ in the sequel). We shall translate the
similarity logics into a slight extension of FO² obtained by augmenting the language
with identity. Actually, we shall restrict ourselves to the following vocabulary:

- a countable set $\{P_i : i \in \omega\} \cup \{Q_i : i \in \omega\}$ of unary predicate symbols,

- a countable set $\{R_{i,j} : i, j \in \omega\}$ of binary predicate symbols,

- the symbol = (interpreted as identity).

In what follows, by a first-order formula we mean a formula belonging to just this
fragment of FOL (written FO²[=] in the sequel). As usual, a first-order structure
$\mathcal{M}$ (restricted to this fragment) is a pair $(D, m)$ such that $D$ is a
non-empty set and $m$ is an interpretation function with
$m(P_i) \cup m(Q_i) \subseteq D$ for $i \in \omega$,
$m(R_{i,j}) \subseteq D \times D$ for $i, j \in \omega$ and
$m(=) = \{(a, a) : a \in D\}$. As usual, a valuation $v_{\mathcal{M}}$ for
$\mathcal{M}$ is a mapping $v_{\mathcal{M}} : \{y_0, y_1\} \to D$. We write
$\mathcal{M}, v_{\mathcal{M}} \models F$ to denote that $F$ is satisfied in
$\mathcal{M}$ under $v_{\mathcal{M}}$, and omit $v_{\mathcal{M}}$ when $F$ is closed.
It is known that FO²[=] has the finite model property, and that
FO²[=]-satisfiability is decidable [Mor75] and NEXPTIME-complete [Lew80,GKV97].
Actually, $F$ is FO²[=]-satisfiable iff $F$ has a model of size at most
$2^{c \times |F|}$ for some fixed $c > 0$ [GKV97].

3.2 Normal forms 

Let $F \in$ FORM be such that¹ VARP(F) = $\{p_1, \ldots, p_l\}$ and
VARSP(F) = $\{P_1, \ldots, P_n\}$.
In the rest of this section, we assume that $n \geq 1$ and $l \geq 1$. The degenerate
cases cause no additional difficulties and they are treated in a separate section.
For any integer $k \in \{0, \ldots, 2^n - 1\}$, by $B_k$ we denote the term

$A_1 \cap \ldots \cap A_n$

where, for any $s \in \{1, \ldots, n\}$, $A_s = P_s$ if $bit_s(k) = 0$, and
$A_s = -P_s$ otherwise, with $bit_s(k)$ denoting the $s$th bit in the binary
representation of $k$. For any integer $k \in \{0, \ldots, 2^n - 1\}$, we denote

$A_{k,0} \stackrel{def}{=} B_k \cap -p_1 \cap \ldots \cap -p_l$

Finally, for any $(k, k') \in \{0, \ldots, 2^n - 1\} \times \{1, \ldots, l\}$, we
denote $A_{k,k'} \stackrel{def}{=} B_k \cap p_{k'}$.
For any TERM-interpretation $v$ : TERM $\to \mathcal{P}(PROP)$, the family

$\{v(A_{k,k'}) : (k, k') \in \{0, \ldots, 2^n - 1\} \times \{0, \ldots, l\}\}$

is a partition of $PROP$. Moreover, for any term $A \in$ TERM(F), either
$A \equiv 0$ or there is a unique non-empty set
$\{A_{k_1,k_1'}, \ldots, A_{k_u,k_u'}\}$ such that
$A \equiv A_{k_1,k_1'} \cup \ldots \cup A_{k_u,k_u'}$. The normal form of $A$, written
$N(A)$, is either 0 or $A_{k_1,k_1'} \cup \ldots \cup A_{k_u,k_u'}$ according to the
two cases above. Such a decomposition, introduced in [Kon97], generalizes with
nominals the canonical disjunctive normal form for the propositional calculus.
$N(A)$ can be computed by an effective procedure.

For any $k' \in \{1, \ldots, l\}$, we write $occ_{k'}$ to denote the set

$\{k \in \{0, \ldots, 2^n - 1\} : \exists A \in$ TERM(F)$,\ N(A) = \ldots \cup A_{k,k'} \cup \ldots\}$

¹ Without any loss of generality we can assume that if $l$ (resp. $q$) nominals for
properties (resp. for entities) occur in $F$ they are precisely the $l$ (resp. $q$)
first in the enumeration of VARP (resp. VARE) since satisfiability is not sensitive
to the renaming of variables.
Informally, $occ_{k'}$ is the set of indices $k$ such that $A_{k,k'}$ occurs in the
normal form of some element of TERM(F). We write $setocc_{k'}$ to denote the set

$\{X \subseteq occ_{k'} : card(occ_{k'}) - 1 \leq card(X) \leq 2^n - 1\}$

The definition of $setocc_{k'}$ is motivated by the fact that for any
TERM-interpretation $v$, there is only one $k \in \{0, \ldots, 2^n - 1\}$ such that
$v(A_{k,k'}) \neq \emptyset$, and for this very $k$, $v(A_{k,k'}) = v(p_{k'})$. For
each $X \in setocc_{k'}$ in turn, in the forthcoming constructions we shall enforce
$v(A_{k,k'}) = \emptyset$ for any $k \in X$.
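
The normal form and the sets $occ_{k'}$ and $setocc_{k'}$ can be computed by
brute force, as in the following Python sketch. It is one possible effective
procedure, not necessarily the one intended by the authors. The tuple encoding
of terms and the function names are hypothetical, and $bit_s(k)$ is assumed to
be the $s$-th least significant bit of $k$. A component $A_{k,k'}$ is
represented by the pair (k, k'), and $N(A)$ by the set of components on which
$A$ holds.

```python
from itertools import combinations


def bit(s, k):
    """Assumed convention: s-th least significant bit of k (s >= 1)."""
    return (k >> (s - 1)) & 1


def holds(A, k, kp):
    """Is the component A_{k,kp} included in term A?  (kp = 0: no nominal.)"""
    tag = A[0]
    if tag == "zero":
        return False
    if tag == "P":                  # set variable P_s, s = A[1]
        return bit(A[1], k) == 0
    if tag == "p":                  # nominal p_j, j = A[1]
        return A[1] == kp
    if tag == "neg":
        return not holds(A[1], k, kp)
    if tag == "cap":
        return holds(A[1], k, kp) and holds(A[2], k, kp)
    if tag == "cup":
        return holds(A[1], k, kp) or holds(A[2], k, kp)
    raise ValueError(tag)


def normal_form(A, n, l):
    """N(A) as a set of components (k, kp); the empty set stands for 0."""
    return {(k, kp) for k in range(2 ** n) for kp in range(l + 1)
            if holds(A, k, kp)}


def occ(terms, n, l, kp):
    """Indices k such that A_{k,kp} occurs in the normal form of some term."""
    return {k for A in terms for (k, j) in normal_form(A, n, l) if j == kp}


def setocc(terms, n, l, kp):
    """Subsets X of occ_kp with card(occ_kp) - 1 <= card(X) <= 2^n - 1."""
    o = sorted(occ(terms, n, l, kp))
    subsets = [set(c) for r in range(len(o) + 1) for c in combinations(o, r)]
    return [X for X in subsets if len(o) - 1 <= len(X) <= 2 ** n - 1]


# Terms of Example 5 (n = 1, l = 2): p_1 u p_2 and P_1.
T1 = ("cup", ("p", 1), ("p", 2))
T2 = ("P", 1)
```

With these two terms, `normal_form(T1, 1, 2)` contains the four components
with k' in {1, 2}, and `normal_form(T2, 1, 2)` contains the three components
with k = 0, in agreement with the normal forms stated in Example 5.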



3.3 The translation 

In this section, we define an extension to $\mathcal{L}$ of the translation $ST$
defined in [Ben83] of modal formulae into a first-order language containing a binary
predicate, a countable set of unary predicate symbols and two individual variables
(due to a smart recycling of the variables). Our translation of the nominals for
entities is similar to the translation of nominals in [GG93]. However, we take into
account the decomposition of terms into components in order to obtain a faithful
translation. The translation of nominals for atomic properties is a twofold one:
we take it into account both in defining the normal form of terms, and in the
generalized disjunction defining the translation $T$ below.

Let $F \in$ FORM be such that VARP(F) = $\{p_1, \ldots, p_l\}$,
VARSP(F) = $\{P_1, \ldots, P_n\}$ and VARE(F) = $\{x_1, \ldots, x_q\}$. Before
defining $ST'$, the mapping translating $\mathcal{L}$-formulae into FO²-formulae,
let us state the main features we intend that mapping to have. Analogously to $ST$,
$ST'$ encodes the quantification in the interpretation of $[A]$ into the language of
FO² by using the standard universal quantifier $\forall$ and by introducing a binary
predicate symbol $R_A$ for each $A \in$ TERM.
However, this is not exactly the way $ST'$ is defined. Actually, to each component
$A_{k,k'}$ we associate the predicate symbol $R_{k,k'}$. The main idea of $ST'$ is
therefore to treat components as constants, which means that the translation of
$[A]G$ is uniquely determined by the components (if any) of the normal form of $A$.
Then, the conditions on the $\mathcal{L}$-models justify why a modal operator
indexed by the union of components is translated into a formula involving a
conjunction of atomic formulae. Let $ST'$ be defined as follows ($ST'$ is actually
parametrized by $F$ and $i \in \{0, 1\}$):

(1) $ST'(E_j, y_i) = P_j(y_i)$; $ST'(x_j, y_i) = Q_j(y_i)$;

(2) $ST'(\neg G, y_i) = \neg ST'(G, y_i)$;
$ST'(G \wedge H, y_i) = ST'(G, y_i) \wedge ST'(H, y_i)$;

(3) $ST'([A]G, y_i)$ is

- $\forall y_0\, ST'(G, y_0)$ if $N(A) = 0$;

- $\forall y_{1-i}\, ((R_{k_1,k_1'}(y_i, y_{1-i}) \wedge \ldots \wedge R_{k_u,k_u'}(y_i, y_{1-i})) \Rightarrow ST'(G, y_{1-i}))$
if $N(A) = A_{k_1,k_1'} \cup \ldots \cup A_{k_u,k_u'}$.

By adopting the standard definition $\langle A \rangle G \stackrel{def}{=} \neg [A] \neg G$,
$ST'$ can be easily defined for $\langle A \rangle G$: the existential quantification
is involved instead of the universal one.
Let $G_0$ be a first-order formula (in FO²) expressing the fact that, for any
$(k, k') \in \{0, \ldots, 2^n - 1\} \times \{0, \ldots, l\}$, $R_{k,k'}$ is
interpreted as a reflexive and symmetric binary relation. Let $G_1$ be a first-order
formula expressing the fact that, for any $i \in \{1, \ldots, q\}$, $Q_i$ is
interpreted² as a singleton, e.g.

$\bigwedge_{i=1}^{q} \exists y_0\, (Q_i(y_0) \wedge \forall y_1\, (\neg\, y_0 = y_1 \Rightarrow \neg Q_i(y_1)))$

In the case when VARE(F) = $\emptyset$, $G_1 \stackrel{def}{=} \forall y_0\ y_0 = y_0$.
Let $T_1(F)$ be the first-order formula (in FO²[=]) defined by

$T_1(F) \stackrel{def}{=} G_0 \wedge G_1 \wedge \exists y_0\, ST'(F, y_0)$

The translation is not quite finished yet. Indeed, although at least one of the
components $P_1 \cap p_1$ or $-P_1 \cap p_1$ is interpreted by the empty set of
properties, this fact is not taken into account in $ST'$ (considering e.g. $n = 1$).
This is a serious gap since at least one of the predicate symbols $R_{0,1}$ or
$R_{1,1}$ should be interpreted as the universal relation. The forthcoming
developments provide an answer to this technical problem.

Let $G$ be a first-order formula, $k' \in \{1, \ldots, l\}$ and
$X_{k'} \in setocc_{k'}$. We write $G[k', X_{k'}]$ to denote the first-order formula
obtained from $G$ by substituting:

- every occurrence of $R_{k,k'}(z_1, z_2) \Rightarrow H$ with $H$ if $k \in X_{k'}$,

- every occurrence of $F' \wedge R_{k,k'}(z_1, z_2) \wedge F'' \Rightarrow H$ with
$F' \wedge F'' \Rightarrow H$ if $k \in X_{k'}$ (the degenerate cases are omitted here)

(this rewriting procedure is confluent and always terminates). From a semantical
viewpoint, the substitution is equivalent³ to satisfaction of the condition
($k \in X_{k'}$): $\forall z_1, z_2,\ R_{k,k'}(z_1, z_2)$. For
$(X_1, \ldots, X_l) \in setocc_1 \times \ldots \times setocc_l$,
we write $G[X_1, \ldots, X_l]$ to denote the first-order formula
$G[1, X_1][2, X_2] \ldots [l, X_l]$.
Observe that for any permutation $\sigma$ on $\{1, \ldots, l\}$,

$G[\sigma(1), X_{\sigma(1)}][\sigma(2), X_{\sigma(2)}] \ldots [\sigma(l), X_{\sigma(l)}] = G[1, X_1][2, X_2] \ldots [l, X_l]$

Let $T(F)$ be the formula

$T(F) \stackrel{def}{=} \bigvee \{T_1(F)[X_1, \ldots, X_l] : (X_1, \ldots, X_l) \in setocc_1 \times \ldots \times setocc_l\}$

Observe that $T$ is exponential-time in $|F|$ and the size of the formula obtained
by translation may increase exponentially. It is, however, not clear whether there

² Let FO²[$\exists^{=1}$] be FO² augmented with the existential quantifier
$\exists^{=1}$ meaning "there exists exactly one". FO²[$\exists^{=1}$]-satisfiability
has been proved to be in NEXPTIME (see e.g. [PST97]). By defining $G_1$ by
$G_1 \stackrel{def}{=} \bigwedge_{i=1}^{q} \exists^{=1} y_0\, Q_i(y_0)$, it is
possible to prove decidability of $\mathcal{L}$-satisfiability via a translation
into FO²[$\exists^{=1}$].

³ Another solution consists in defining $G[k', X_{k'}]$ as the formula
$(\bigwedge_{k \in X_{k'}} \forall y_0, y_1,\ R_{k,k'}(y_0, y_1)) \wedge G$.
exists a tighter translation that characterizes more accurately the complexity
class of $\mathcal{L}$-satisfiability. Observe also that $T(F)$ is classically
equivalent to

$G_0 \wedge G_1 \wedge \exists y_0 \bigvee \{ST'(F, y_0)[X_1, \ldots, X_l] : (X_1, \ldots, X_l) \in setocc_1 \times \ldots \times setocc_l\}$



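Example 5. Let $F$ be the formula
$\langle p_1 \cup p_2 \rangle \neg E_1 \wedge [P_1] E_1$. Then $F$ is
$\mathcal{L}$-satisfiable, and the translation of $F$ is the disjunction of the
following formulae:

1. $G_0 \wedge G_1 \wedge \exists y_0\, (\exists y_1\, R_{1,1}(y_0, y_1) \wedge R_{1,2}(y_0, y_1) \wedge \neg P_1(y_1)) \wedge (\forall y_1\, R_{0,0}(y_0, y_1) \Rightarrow P_1(y_1))$

2. $G_0 \wedge G_1 \wedge \exists y_0\, (\exists y_1\, R_{0,1}(y_0, y_1) \wedge R_{1,2}(y_0, y_1) \wedge \neg P_1(y_1)) \wedge (\forall y_1\, R_{0,0}(y_0, y_1) \wedge R_{0,1}(y_0, y_1) \Rightarrow P_1(y_1))$

3. $G_0 \wedge G_1 \wedge \exists y_0\, (\exists y_1\, R_{1,1}(y_0, y_1) \wedge R_{0,2}(y_0, y_1) \wedge \neg P_1(y_1)) \wedge (\forall y_1\, R_{0,0}(y_0, y_1) \wedge R_{0,2}(y_0, y_1) \Rightarrow P_1(y_1))$

4. $G_0 \wedge G_1 \wedge \exists y_0\, (\exists y_1\, R_{0,1}(y_0, y_1) \wedge R_{0,2}(y_0, y_1) \wedge \neg P_1(y_1)) \wedge (\forall y_1\, R_{0,0}(y_0, y_1) \wedge R_{0,1}(y_0, y_1) \wedge R_{0,2}(y_0, y_1) \Rightarrow P_1(y_1))$

The translation takes into account that
$N(p_1 \cup p_2) = A_{0,1} \cup A_{1,1} \cup A_{0,2} \cup A_{1,2}$ and
$N(P_1) = A_{0,0} \cup A_{0,1} \cup A_{0,2}$.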
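
The four disjuncts of Example 5 can be regenerated mechanically. The Python
sketch below is illustrative only and rests on several assumptions: formulae
are encoded as tuples, the normal forms and the sets $setocc_{k'}$ are supplied
by hand (copied from Example 5), the output is an ASCII string with `~`, `&`
and `->` standing for the connectives, $G_0$ and $G_1$ are left as the symbols
`G0` and `G1`, and the substitution $[X_1, \ldots, X_l]$ is realised by
dropping the atoms $R_{k,k'}$ with $k \in X_{k'}$ while translating. The
function names `st` and `translate` are hypothetical.

```python
from itertools import product


def st(F, i, removed):
    """ST'(F, y_i) with the components listed in `removed` already dropped.

    Formulae: ("E", j), ("x", j), ("not", G), ("and", G, H),
    ("box", nf, G), ("dia", nf, G), where nf is a set of components (k, k').
    """
    y, z = f"y{i}", f"y{1 - i}"
    tag = F[0]
    if tag == "E":
        return f"P{F[1]}({y})"
    if tag == "x":
        return f"Q{F[1]}({y})"
    if tag == "not":
        return f"~{st(F[1], i, removed)}"
    if tag == "and":
        return f"({st(F[1], i, removed)} & {st(F[2], i, removed)})"
    nf, G = F[1], F[2]
    if not nf:                                    # case N(A) = 0
        quant = "forall" if tag == "box" else "exists"
        return f"{quant} y0 {st(G, 0, removed)}"
    kept = [c for c in sorted(nf, key=lambda c: (c[1], c[0]))
            if c not in removed]
    atoms = [f"R{k},{kp}({y},{z})" for (k, kp) in kept]
    body = st(G, 1 - i, removed)                  # the two variables alternate
    if tag == "box":
        if not atoms:
            return f"forall {z} {body}"
        return f"forall {z} (({' & '.join(atoms)}) -> {body})"
    return f"exists {z} ({' & '.join(atoms + [body])})"


def translate(F, setoccs):
    """Disjuncts of T(F), one per choice (X_1, ..., X_l)."""
    kps = sorted(setoccs)
    disjuncts = []
    for choice in product(*(setoccs[kp] for kp in kps)):
        removed = {(k, kp) for kp, X in zip(kps, choice) for k in X}
        disjuncts.append("G0 & G1 & exists y0 " + st(F, 0, removed))
    return disjuncts


# Data of Example 5 (n = 1, l = 2), entered by hand.
NF_P1P2 = {(0, 1), (1, 1), (0, 2), (1, 2)}        # N(p_1 u p_2)
NF_P1 = {(0, 0), (0, 1), (0, 2)}                  # N(P_1)
F = ("and", ("dia", NF_P1P2, ("not", ("E", 1))), ("box", NF_P1, ("E", 1)))
SETOCC = {1: [{0}, {1}], 2: [{0}, {1}]}
DISJUNCTS = translate(F, SETOCC)
```

The list `DISJUNCTS` has four entries, one per pair $(X_1, X_2)$; they
correspond to items 1 to 4 of Example 5 up to the order in which the pairs are
enumerated.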

3.4 Faithfulness of the translation 

The rest of this section is devoted to proving Proposition 6 below and stating
certain corollaries (some being consequences of the proof of Proposition 6).

Proposition 6. (1) $F$ is $\mathcal{L}$-satisfiable iff (2) $T(F)$ is first-order
satisfiable.

Proof. (1) implies (2). First assume $\mathcal{U}, e_0 \models F$ for some model

$\mathcal{U} = (ENT, PROP, (sim_P)_{P \subseteq PROP}, v)$

and $e_0 \in ENT$ (this is the easier part of the proof). Let us define the following
first-order structure $\mathcal{M} \stackrel{def}{=} (D, m)$:

- $D = ENT$; for any $i \in \omega$, $m(Q_i) = v(x_i)$ and $m(P_i) = v(E_i)$;

- for any $(k, k') \in \{0, \ldots, 2^n - 1\} \times \{0, \ldots, l\}$,
$m(R_{k,k'}) = sim_{v(A_{k,k'})}$ (for the other values of $(k, k')$ the
interpretation of $R_{k,k'}$ is not constrained).

Let $(i_1, \ldots, i_l) \in \{0, \ldots, 2^n - 1\}^l$ be such that for any
$k' \in \{1, \ldots, l\}$, $v(A_{i_{k'},k'}) = v(p_{k'})$. Such a sequence
$(i_1, \ldots, i_l)$ is unique. So, for any $k' \in \{1, \ldots, l\}$,
$X_{k'} = occ_{k'} \setminus \{i_{k'}\}$. It is easy to show that
$\mathcal{M} \models G_0 \wedge G_1$ since $\mathcal{U}$ is a model. We claim that
$\mathcal{M} \models T_1(F)[X_1, \ldots, X_l]$, and therefore
$\mathcal{M} \models T(F)$. To prove such a result, let us show that for any
$G \in sub(F)$, $e \in ENT$, $i \in \{0, 1\}$, $\mathcal{U}, e \models G$ iff
$\mathcal{M}, v_{\mathcal{M}}[y_i \leftarrow e] \models ST'(G, y_i)[X_1, \ldots, X_l]$.
We write $v_{\mathcal{M}}[y_i \leftarrow e]$ to denote a first-order valuation
$v_{\mathcal{M}}$ such that $v_{\mathcal{M}}(y_i) = e$. This entails
$\mathcal{M} \models \exists y_0\, ST'(F, y_0)[X_1, \ldots, X_l]$. We omit the base
case and the cases in the induction step when the outermost connective is Boolean.
Here are the remaining cases.
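Case 1: $G = [A]F_1$ and $N(A) = \emptyset$

$\mathcal{U}, e \models [A]F_1$ iff for any $e' \in sim_{v(A)}(e)$, $\mathcal{U}, e' \models F_1$

iff for any $e' \in ENT$, $\mathcal{U}, e' \models F_1$

iff for any $e' \in D$, $\mathcal{M}, v_{\mathcal{M}}[y_0 \leftarrow e'] \models ST'(F_1, y_0)[X_1, \ldots, X_l]$

iff $\mathcal{M} \models \forall y_0\, ST'(F_1, y_0)[X_1, \ldots, X_l]$

iff $\mathcal{M} \models ST'([A]F_1, y_i)[X_1, \ldots, X_l]$

Case 2: $G = [A]F_1$ and $N(A) = A_{k_1,k'_1} \cup \ldots \cup A_{k_u,k'_u}$

First observe that for any $k' \in \{1, \ldots, l\}$ and $k \in X_{k'}$, $v(A_{k,k'}) = \emptyset$.

$\mathcal{U}, e \models [A]F_1$ iff for any $e' \in \bigcap_{s \in \{1,\ldots,u\}} sim_{v(A_{k_s,k'_s})}(e)$, $\mathcal{U}, e' \models F_1$

iff for any $e' \in \bigcap_{s \in \{1,\ldots,u\}} sim_{v(A_{k_s,k'_s})}(e)$,
$\mathcal{M}, v_{\mathcal{M}}[y_{1-i} \leftarrow e'] \models ST'(F_1, y_{1-i})[X_1, \ldots, X_l]$

iff for any $e' \in \bigcap_{s \in \{1,\ldots,u\}} m(R_{k_s,k'_s})(e)$,
$\mathcal{M}, v_{\mathcal{M}}[y_{1-i} \leftarrow e'] \models ST'(F_1, y_{1-i})[X_1, \ldots, X_l]$

iff $\mathcal{M}, v_{\mathcal{M}}[y_i \leftarrow e] \models \forall y_{1-i}\, (R_{k_1,k'_1}(y_i, y_{1-i}) \wedge \ldots \wedge R_{k_u,k'_u}(y_i, y_{1-i}))
\Rightarrow ST'(F_1, y_{1-i})[X_1, \ldots, X_l]$

iff $\mathcal{M}, v_{\mathcal{M}}[y_i \leftarrow e] \models \forall y_{1-i}\, (\bigwedge_{1 \le s \le u, \ldots} R_{k_s,k'_s}(y_i, y_{1-i}))
\Rightarrow ST'(F_1, y_{1-i})[X_1, \ldots, X_l]$

(where $\bigwedge_{1 \le s \le u, \ldots} R_{k_s,k'_s}(y_i, y_{1-i}) \stackrel{\text{def}}{=} \top$ if the conjunction is empty)

(In the previous line the substitution operation is performed only on
$ST'(F_1, y_{1-i})$ whereas in the next line it is performed on the whole expression.)

iff $\mathcal{M}, v_{\mathcal{M}}[y_i \leftarrow e] \models (\forall y_{1-i}\, (R_{k_1,k'_1}(y_i, y_{1-i}) \wedge \ldots \wedge R_{k_u,k'_u}(y_i, y_{1-i}))
\Rightarrow ST'(F_1, y_{1-i}))[X_1, \ldots, X_l]$

iff $\mathcal{M}, v_{\mathcal{M}}[y_i \leftarrow e] \models ST'([A]F_1, y_i)[X_1, \ldots, X_l]$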

(2) implies (1). Omitted because of lack of space. 



Corollary 7. (1) The $\mathcal{L}$-satisfiability problem is decidable. (2) $\mathcal{L}$ has the finite
model property. In particular, every $\mathcal{L}$-satisfiable formula F has a model such that
card(ENT) 

< '' for some fixed polynomial p{n), and the cardinality of the relevant 

part of $\mathcal{U}$ is at most $2^n + l$, where n = card(VARSP(F)) and l = card(VARP(F)).

As observed by one referee, formalizing concepts from similarity theory di- 
rectly in first-order logic could be another alternative. 



3.5 The degenerate cases 

In the previous section we have assumed that $n \ge 1$ and $l \ge 1$. Now let us
examine the remaining cases. If $l \ge 1$ and no variable for sets of properties
occurs in the formula (n = 0), then we consider the following components:
$A_0 = -p_1 \cap \ldots \cap -p_l$ and $A_{k'} = p_{k'}$ for $k' \in \{1, \ldots, l\}$. Condition (3) in the
definition of ST' becomes:

$ST'([A]G, y_i) = \begin{cases} \forall y_0\, ST'(G, y_0) & \text{if } N(A) = \emptyset \\ \forall y_{1-i}\, (R_{0,k_1}(y_i, y_{1-i}) \wedge \ldots \wedge R_{0,k_u}(y_i, y_{1-i})) \Rightarrow ST'(G, y_{1-i}) & \text{if } N(A) = A_{k_1} \cup \ldots \cup A_{k_u} \end{cases}$

If $n \ge 1$ and no variable for individual properties occurs in the formula
(l = 0), then Condition (3) in the definition of ST' becomes:

$ST'([A]G, y_i) = \begin{cases} \forall y_0\, ST'(G, y_0) & \text{if } N(A) = \emptyset \\ \forall y_{1-i}\, (R_{k_1,0}(y_i, y_{1-i}) \wedge \ldots \wedge R_{k_u,0}(y_i, y_{1-i})) \Rightarrow ST'(G, y_{1-i}) & \text{if } N(A) = B_{k_1} \cup \ldots \cup B_{k_u} \text{ (see Section 3.2)} \end{cases}$

Moreover, T(F) is simply defined as $T_1(F)$. In the case when n = 0, l = 0 and
$TERM(F) \ne \emptyset$, by substituting every occurrence of $\emptyset$ in F by $p_1 \cap -p_1$ we preserve
$\mathcal{L}$-satisfiability and reduce the case to the previous one. Otherwise, F is a formula
of the propositional calculus and therefore it poses no difficulty with respect to
decidability.

4 Decidability results for variants of $\mathcal{L}$

4.1 Fixed finite set of properties 

In this section we consider a finite set PROP of properties, and show that $\mathcal{L}_{PROP}$
shares various features with $\mathcal{L}$. Actually $\mathcal{L}_{PROP}$ corresponds to the similarity
logic with a fixed finite set of properties defined in [Kon97]. Without any loss
of generality we can assume that $PROP = \{1, \ldots, \alpha\}$ for some $\alpha \ge 1$ and
$VARP = \{p_1, \ldots, p_\alpha\}$.

Let F be an $\mathcal{L}_{PROP}$-formula such that $VARP(F) = \{p_{i_1}, \ldots, p_{i_l}\}$ and $VARSP(F) =
\{P_{j_1}, \ldots, P_{j_n}\}$. For any interpretation v possibly occurring in some $\mathcal{L}_{PROP}$-model
and for any $A \in TERM(F)$, if $v(A) = \{k_1, \ldots, k_s\}$, then $N_v(A) \stackrel{\text{def}}{=} p_{k_1} \cup \ldots \cup p_{k_s}$,
otherwise ($v(A) = \emptyset$) $N_v(A) \stackrel{\text{def}}{=} \emptyset$. We write $v_F$ (resp. $N_v(F)$) to denote the
restriction of v to TERM(F) (resp. the formula obtained from F by substituting
every occurrence of A by $N_v(A)$). Let $X_F$ be the finite set

$\{v_F : v \text{ interpretation possibly occurring in some } \mathcal{L}_{PROP}\text{-model}\}$



Proposition 8. Let F be an $\mathcal{L}_{PROP}$-formula. (1) F is $\mathcal{L}_{PROP}$-satisfiable iff (2)
$\bigvee_{v'' \in X_F} N_{v''}(F)$ is $\mathcal{L}$-satisfiable.

Proof. (1) implies (2): Assume $\mathcal{U} = (ENT, PROP, (sim_P)_{P \subseteq PROP}, v), e_0 \models F$
for some $e_0 \in ENT$. It is easy to check that $\mathcal{U}', e_0 \models N_v(F)$ where $\mathcal{U}'$ is defined
from $\mathcal{U}$ by only replacing v by v' defined as follows: for any $i \in \{1, \ldots, \alpha\}$,
$v'(p_i) \stackrel{\text{def}}{=} \{i\}$. Let $\mathcal{U}'' = (ENT, \omega, (sim''_P)_{P \subseteq \omega}, v'')$ be an $\mathcal{L}$-model such that

- v' and v'' are identical for the common sublanguage,

- for any $i > \alpha$, $v''(p_i) = \{i\}$; for any $P \subseteq \omega$, $sim''_P \stackrel{\text{def}}{=} sim_{P \cap PROP}$.

It is a routine task to check that $\mathcal{U}'', e_0 \models N_v(F)$ and therefore $\mathcal{U}'', e_0 \models
\bigvee_{v'' \in X_F} N_{v''}(F)$. Indeed, for any $A \in TERM(F)$, $sim''_{v''(A)} = \ldots$

(2) implies (1): Now assume $\bigvee_{v'' \in X_F} N_{v''}(F)$ is $\mathcal{L}$-satisfiable. There exist an
$\mathcal{L}$-model $\mathcal{U}' = (ENT', PROP', (sim'_P)_{P \subseteq PROP'}, v')$, $e_0 \in ENT'$ and $v''_0 \in X_F$
such that $\mathcal{U}', e_0 \models N_{v''_0}(F)$. By the proof of Proposition 6, we can assume that
$\{u_1, \ldots, u_\alpha\}$ is a relevant part of PROP' in $\mathcal{U}'$ such that for any $i \in \{1, \ldots, \alpha\}$,
$v'(p_i) = u_i$. Indeed, PROP' is at least countable, $VARSP(\bigvee_{v'' \in X_F} N_{v''}(F)) = \emptyset$
and $card(VARP(\bigvee_{v'' \in X_F} N_{v''}(F))) \le \alpha$.

Let $\mathcal{U} = (ENT', PROP, (sim_P)_{P \subseteq PROP}, v)$ be the $\mathcal{L}_{PROP}$-model such that:

- v and $v''_0$ are identical for the common sublanguage;

- for any $P \subseteq PROP$, $sim_P \stackrel{\text{def}}{=} \ldots$

It is a routine task to check that $\mathcal{U}, e_0 \models F$ since for any $A \in TERM(F)$, $\ldots = \ldots$

□



Corollary 9. $\mathcal{L}_{PROP}$-satisfiability is decidable and $\mathcal{L}_{PROP}$ has the finite model
property.



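Example 10. (Example 5 continued) Let F be the formula $\langle p_1 \cup p_2 \rangle \neg E_1 \wedge [P_1]E_1$
for the logic $\mathcal{L}_{\{1,2\}}$. Then, F is not $\mathcal{L}_{\{1,2\}}$-satisfiable, although F is $\mathcal{L}$-satisfiable.
The formula $\bigvee_{v'' \in X_F} N_{v''}(F)$ is the disjunction of the following formulae:

1. $(\langle p_1 \cup p_2 \rangle \neg E_1 \wedge [\emptyset]E_1) \vee (\langle p_1 \cup p_2 \rangle \neg E_1 \wedge [p_1]E_1)$

2. $(\langle p_1 \cup p_2 \rangle \neg E_1 \wedge [p_2]E_1) \vee (\langle p_1 \cup p_2 \rangle \neg E_1 \wedge [p_1 \cup p_2]E_1)$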
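
The disjuncts of Example 10 arise by letting the set variable P1 range over the subsets of PROP = {1, 2} and naming each subset by a union of nominals. The following sketch enumerates them as strings; it assumes, as in the example, that p1 and p2 denote {1} and {2}, and the helper names `name_of` and `instances` are hypothetical.

```python
from itertools import combinations

PROP = (1, 2)  # the fixed finite set of properties of Example 10


def name_of(subset):
    """N_v(A): the union of the nominals p_k for k in v(A), or the empty term."""
    return " ∪ ".join(f"p{k}" for k in subset) if subset else "∅"


def instances():
    """One instance N_v(F) per possible value of v(P1), a subset of PROP."""
    subsets = [c for r in range(len(PROP) + 1) for c in combinations(PROP, r)]
    return [f"<p1 ∪ p2>¬E1 ∧ [{name_of(s)}]E1" for s in subsets]
```

With a set of properties of size α and a single set variable there are 2^α instances, which is why the set X_F is finite.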

4.2 Fixed infinite set of properties 

In this section we consider some infinite set PROP of properties, and show
that $\mathcal{L}_{PROP}$ shares various features with $\mathcal{L}$. Actually, $\mathcal{L}_{PROP}$ corresponds to
the similarity logic with a fixed infinite set of properties defined in [Kon97].
Without any loss of generality we can assume that $\omega \subseteq PROP$ (there is an
injective map f from $\omega$ into PROP) and $\{p_1, p_2, \ldots\} \subseteq VARP$.

Proposition 11. Let F be an $\mathcal{L}_{PROP}$-formula. (1) F is $\mathcal{L}_{PROP}$-satisfiable iff (2)
F is $\mathcal{L}$-satisfiable.

Proof. Omitted because of lack of space. 



Corollary 12. $\mathcal{L}_{PROP}$-satisfiability is decidable and $\mathcal{L}_{PROP}$ has the finite model
property.
5 Concluding remarks

We have shown that the relative similarity logics $\mathcal{L}$ and $\mathcal{L}_X$, for some non-empty
set X of properties, have decidable satisfiability problems. Moreover, we have
also established that such logics have the finite model property. The decidability
proof reduces satisfiability in our logic to satisfiability in $FO^2[=]$, a decidable
fragment of classical logic [Mor75]. Although our reduction takes advantage of the
standard translation ST [Ben83] of modal logic into classical logic, the novelty
of our approach consists in the method of handling nominals for atomic
properties and decomposition in terms of components encoded in the translation. The
reduction into $FO^2[=]$ can be generalized to any relative logic provided that

1. the conditions on the relations of the models can be expressed by a
first-order formula involving at most two variables (see the definition of the
formula $G_0$ in Section 3.3), and

2. the class of binary relations underlying the logic is closed under intersection.

For instance, if in the definition of $\mathcal{L}$ we replace reflexivity by weak reflexivity,
then decidability and the finite model property still hold. This is particularly
interesting since weakly reflexive and symmetric modal frames represent exactly
the negative similarity relations in information systems (see e.g. [Vak91a, DO96]).
For the sake of comparison, the class of reflexive and symmetric modal frames
represents precisely the positive similarity relations in information systems.

We have also shown that $\mathcal{L}$-satisfiability is EXPTIME-hard (by taking
advantage of the general results from [Hem96]), and that the problem can be solved
by a deterministic Turing machine in time $O(2^{2^{p(n)}})$ for some polynomial p(n),
where n is the length of the tested formula. Indeed, the translation process T is
exponential in time in the length of the formula, T may increase exponentially
the length of the formula, and satisfiability for $FO^2[=]$ is in NEXPTIME. It is
therefore an open problem to characterize more accurately the complexity class
of $\mathcal{L}$-satisfiability. However, the translations we have established can already be
used to mechanize the relative similarity logics by taking $FO^2[=]$ as the target
logic and by using a theorem prover dedicated to it. We conjecture that more
efficient methods might exist for the mechanization.
Abstract. In this paper we introduce a conditional logic BC to represent
belief revision. Logic BC has a standard semantics in terms of
possible worlds structures with a selection function and has strong
similarities with Stalnaker's logic C2. Moreover, Gärdenfors' Triviality Result
does not apply to BC. We provide a representation result, which shows
that each belief revision system corresponds to a BC-model and every
BC-model satisfying the covering condition determines a belief revision
system.



1 Introduction 

One of the most important ideas in the analysis of conditional propositions comes 
from a proposal of F.P.Ramsey who proposed in [15] that in order to decide 
whether to accept a conditional proposition A > B (whose meaning is: “if A 
were true then B would be true” ) we should add the antecedent A to our belief 
set, changing it as little as possible, and then consider whether the consequent 
B follows. Stalnaker’s logic [16] stems from this intuition, even if Stalnaker was 
interested in analyzing the truth conditions of conditional propositions, whereas 
Ramsey maintained that conditional propositions do not have any truth value. 

The acceptability criterion proposed by Ramsey received renewed interest
ten years ago from Gärdenfors [3,5], who developed together with Alchourrón
and Makinson [1] a theory of epistemic change. In [3] the following
version of Ramsey's acceptance criterion is proposed:

Ramsey Test: $A > B \in K$ iff $B \in K * A$,

where K represents a belief set (that is, a deductively closed set of sentences) and
* represents a belief revision operator. The operator * transforms ("revises") a
belief set K by adding a formula A in such a way that the resulting belief set,
denoted by K * A, is consistent if A is; moreover, K * A is obtained by minimally
changing K. Gärdenfors, Alchourrón and Makinson have expressed this minimal
change requirement and other natural conditions on revision operators by a set
of rationality postulates (called the AGM postulates) that we will recall in the
next section.

In spite of the similarities between the semantics of belief revision and the
evaluation of conditionals, the above very intuitive acceptance principle leads
to the well known Triviality Result by Gärdenfors [3], which claims that there
are no significant belief revision systems that are compatible with the Ramsey
test. The core of Gärdenfors' Triviality Result is the inconsistency between
the Preservation Principle, which governs belief revision, and the Monotonicity
Principle, which follows from the Ramsey test. By the former principle it
is intended that, if a formula B is accepted in a given belief set K and A is
consistent with the beliefs in K, then B is still accepted in the minimal change
of K needed to accept A. By the latter principle it is intended that $K \subseteq K'$
implies $K * A \subseteq K' * A$.

Since Gardenfors’ Triviality Result appeared, many authors have proposed 
a solution to the problem. Several authors have studied alternative notions of 
belief change, like “belief update” [7,8,6], which do not enforce the Preservation 
Principle. 

Rather than considering alternative notions of belief change, we follow an- 
other line of research (see [10,8,12,2]) which has focused on the problem of cap- 
turing belief revision itself within a conditional system in a less stringent way 
than the original formulation of RT. The dependency between conditionals and 
belief sets should be less strict, in the sense that if we accept a conditional propo- 
sition with respect to a belief set K, this does not entail that we are willing to 
accept it with respect to every larger belief set. In this sense, the acceptability 
of conditionals is nonmonotonic in K. 

We adopt a weaker formulation of the Ramsey test. Namely:

(RT) $A > B$ is "accepted" in K iff $B \in K * A$,

where the notion of "acceptability" of a conditional $A > B$ in K is, following
the spirit of Levi's proposal, a weaker condition than "$A > B \in K$". In this
work, we interpret the acceptability of a conditional $A > B$ as: $A > B$ is true in
a world which satisfies the conditional theory $Th_K$ associated to each belief set
K. We get nonmonotonicity in the sense stated above from the fact that from
$K \subseteq K'$ it does not follow that $Th_K \subseteq Th_{K'}$.

A semantical reconstruction of revision in terms of conditional logic has to
depart from the standard semantics of conditionals. This aspect has been pointed
out in [12] and in [2]. Semantically, to each belief set K is associated the set of
its models (or worlds) [[K]] and a preorder relation $\le_K$. To $K * A$ is associated
the set of models closest to [[K]] with respect to $\le_K$ which satisfy A. This global
dependency on K (because of $\le_K$) is the basic semantical difference with the case
of the above mentioned update operator and is also the main difficulty in giving a
semantical reconstruction of revision in terms of conditional logics. One possible
solution, pursued by Friedman and Halpern in [2], is to introduce another level of
semantic objects, called epistemic states, to account for this dependency; each
K corresponds to an epistemic state and each epistemic state has an associated
set of worlds. In our approach, we do not introduce any further semantic object
to account for this global dependency on belief sets; as we will see, each world
carries with itself, so to say, the information about what is believed in it.

In this paper, we define a conditional logic which formalizes the idea of veri- 
fying acceptability of conditionals with respect to a belief set. However, we want 
to stay as close as possible to standard conditional logic and to define the truth
conditions of formulas relatively to worlds, rather than to belief states, no matter 
how they are represented. To this purpose we need a belief operator which allows 
each world to be associated with a set of beliefs. As we will see, it is possible 
to define such a belief operator through the conditional implication itself. The 
resulting logical system is a conditional logic which allows one to represent belief 
sets and their revision. We will also see that our logic has strong connections 
with Stalnaker’s logic. 

In Section 2 we briefly recall the AGM postulates and the triviality result. 
In section 3, we introduce our conditional logic, and in section 4 we provide 
a representation theorem which maps belief revision systems and conditional 
models. Finally, section 5 concludes the paper and provides comparisons with 
other proposals. 

2 Belief revision 

We have mentioned in the previous section that Ramsey’s criterion of accept- 
ability for conditionals is to add the antecedent of the conditional to the belief 
set, changing the belief set as little as possible, and see whether the consequent 
belongs to the belief set so changed. But Ramsey did not worry about defining 
any operation to change belief sets. 

In [1,5,4] two operations on belief sets are introduced, namely expansion and
revision. Let Cn(A) denote the deductive closure of A in classical propositional
logic. We define a belief set K as a deductively closed set of propositional
formulas, that is, K = Cn(K). Expansion is the simple addition of a formula A
to a belief set K, and it is defined by: $K + A = Cn(K \cup \{A\})$. Revision is the
consistent addition of a formula A to a belief set K, denoted by $K * A$.

Alchourrón, Gärdenfors and Makinson in [1] have proposed some rationality
postulates that any belief change operator must satisfy. The AGM postulates
enforce the Preservation Principle [5]:

Revision postulates

(K1) $K * A$ is a belief set;

(K2) $A \in K * A$;

(K3) $K * A \subseteq K + A$;

(K4) if $\neg A \notin K$, then $K + A \subseteq K * A$;

(K5) $K * A = K_\bot$ only if $\vdash \neg A$;

(K6) if $\vdash A \leftrightarrow B$ then $K * A = K * B$;

(K7) $K * (A \wedge B) \subseteq (K * A) + B$;

(K8) if $\neg B \notin K * A$, then $(K * A) + B \subseteq K * (A \wedge B)$.

Postulate (K2) says that revision is always successful; postulate (K3) says that
the revision of a belief set with a formula A does not lead to conclude more than
what can be concluded by the simple expansion of K with A; (K4) is equivalent
to the Preservation Principle and says that when we make the revision of K
with a formula consistent with K, no information of K has to be rejected. Taken
together, (K3) and (K4) say that, if A is consistent with K, then a revision of K
with A is just an expansion of K with A. Postulate (K5) says that the revision is
consistent unless the added formula is inconsistent by itself. Postulate (K6) says
that the result of revision does not depend on the syntactic form of the added
information. Postulates (K7) and (K8) can be regarded as a generalization of
(K3) and (K4) to deal with conjunctions.
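
To make expansion and revision concrete, the following sketch represents a belief set over two atoms by its set of models. Expansion is then intersection of model sets. For revision, the paper fixes no particular operator; the sketch swaps in a distance-based (Dalal-style) operator purely as an illustration, and the names `models`, `expand`, `hamming` and `revise` are hypothetical. In this representation $K \subseteq K'$ corresponds to the reverse inclusion of model sets.

```python
from itertools import product

ATOMS = ("p", "q")
WORLDS = list(product((False, True), repeat=len(ATOMS)))  # truth assignments


def models(formula):
    """Worlds satisfying a formula given as a predicate on a dict of atoms."""
    return {w for w in WORLDS if formula(dict(zip(ATOMS, w)))}


def expand(k_models, a_models):
    """K + A: the models of Cn(K ∪ {A}) are the common models of K and A."""
    return k_models & a_models


def hamming(u, v):
    """Number of atoms on which two worlds disagree."""
    return sum(x != y for x, y in zip(u, v))


def revise(k_models, a_models):
    """K * A (assumed Dalal-style): the models of A closest to the models of K."""
    if not k_models or not a_models:
        return set(a_models)
    dist = {w: min(hamming(w, k) for k in k_models) for w in a_models}
    best = min(dist.values())
    return {w for w in a_models if dist[w] == best}
```

For K = Cn(p ∧ q) and A = ¬p, `revise` keeps only the world where p is false and q is true, so q survives the revision, whereas `expand` returns the empty set of models (an inconsistent belief set).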

In [3], Gärdenfors wonders whether Ramsey's proposal can be formalized
using the notion of revision operator as determined by his postulates. As we have
already said, the above postulates are not compatible with the Ramsey test. To
see this fact, let us define a belief revision system as a pair $\langle \mathbf{K}, * \rangle$, where *
is a revision operator and $\mathbf{K}$ is a set of belief sets closed under *. Moreover, we
say that a belief revision system is non-trivial if there are at least three disjoint
propositions A, B, C (such that $\vdash \neg(A \wedge B)$ and $\vdash \neg(A \wedge C)$ and $\vdash \neg(B \wedge C)$)
and a belief set $K \in \mathbf{K}$ such that K is consistent with A, B, C. We can roughly
say that a trivial belief revision system contains only complete belief sets. Such
belief revision systems represent a sort of degenerate case.

The Triviality Result claims that there are no non-trivial belief revision
systems that satisfy the Ramsey test. As we have already seen, the problem is that
a direct consequence of the Ramsey test, the Monotonicity Principle, is
inconsistent with the Preservation Principle. First of all, it is easily shown that the
Monotonicity Principle follows directly from the Ramsey test: if $B \in K * A$
then by the first half of the Ramsey test $A > B \in K$. But if $K \subseteq K_1$, then
$A > B \in K_1$ and, by the second half of the Ramsey test, $B \in K_1 * A$. Therefore
we can conclude that for each K, $K_1$, if $K \subseteq K_1$ then $K * A \subseteq K_1 * A$. On the
other hand the Preservation Principle makes the revision operator nonmonotonic
in the sense that, given two belief sets K and $K_1$ such that $K \subseteq K_1$, we cannot
conclude that $K * A \subseteq K_1 * A$. As an example, let us consider three belief sets K,
$K_1$, $K_2$ such that K is the deductive closure of $K_1 \cup K_2$ and such that, for some
formula A, $K_1 + A$ is consistent, $K_2 + A$ is consistent but $K + A$ is not consistent.
From (K4) we can conclude that $K_1 + A \subseteq K_1 * A$ and $K_2 + A \subseteq K_2 * A$. From
the Monotonicity Principle we should conclude that $K_1 * A \subseteq K * A$ and hence
that $K_1 + A \subseteq K * A$ and we should similarly conclude that $K_2 + A \subseteq K * A$.
Therefore $(K_1 \cup K_2) + A \subseteq K * A$ and therefore $K * A$ would be inconsistent,
which contradicts the revision postulate (K5).
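
The counterexample can be replayed on a concrete instance. The sketch below is an illustration under assumed choices, none of which come from the paper: two atoms p and q, K1 = Cn(p), K2 = Cn(q), and A = ¬(p ∧ q). Belief sets are represented by their sets of models, so a larger belief set has fewer models and expansion is intersection.

```python
from itertools import product

WORLDS = set(product((False, True), repeat=2))  # assignments to (p, q)

K1 = {w for w in WORLDS if w[0]}                 # models of Cn(p)
K2 = {w for w in WORLDS if w[1]}                 # models of Cn(q)
K = K1 & K2                                      # models of Cn(K1 ∪ K2)
A = {w for w in WORLDS if not (w[0] and w[1])}   # models of ¬(p ∧ q)

# Expansion intersects model sets; a belief set is consistent iff it has a model.
K1_plus_A, K2_plus_A, K_plus_A = K1 & A, K2 & A, K & A

# (K4) with Monotonicity gives K_i + A ⊆ K * A for i = 1, 2, i.e. every model
# of K * A must be a model of both K1 + A and K2 + A.
upper_bound_on_models_of_K_star_A = K1_plus_A & K2_plus_A
```

Here `K1_plus_A` and `K2_plus_A` are non-empty while the upper bound is empty, so K * A would have no model even though A is consistent, which is exactly the clash with (K5).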



3 The Conditional Logic BC 

Definition 1. The language $\mathcal{L}_>$ of logic BC is an extension of the language $\mathcal{L}$
of classical propositional logic obtained by adding the conditional operator >.
Let us define the following modalities:

$\Box A \equiv \neg A > \bot$
$\Diamond A \equiv \neg(A > \bot)$.
We call a modal formula a formula A of the form $\bigcirc_1 \cdots \bigcirc_m B$, where $m \ge 0$
and for $i = 1, \ldots, m$, $\bigcirc_i$ is either $\Box$ or $\Diamond$, and $B \in \mathcal{L}$. The logic BC contains
the following axioms and inference rules:

(CLASS) All classical axioms and inference rules;

(ID) $A > A$;

(RCEA) if $\vdash A \leftrightarrow B$, then $\vdash (A > C) \leftrightarrow (B > C)$;

(RCK) if $\vdash A \rightarrow B$, then $\vdash (C > A) \rightarrow (C > B)$;

(DT) $((A \wedge C) > B) \rightarrow (A > (C \rightarrow B))$, for $A, B, C \in \mathcal{L}$;

(CV) $\neg(A > \neg C) \wedge (A > B) \rightarrow ((A \wedge C) > B)$, for $A, B, C \in \mathcal{L}$;

(REFL) $(\top > A) \rightarrow A$;

(EUC) $\neg(A > B) \rightarrow A > \neg(\top > B)$;

(TRANS) $(A > B) \rightarrow A > (\top > B)$;

(BEL) $(A > B) \rightarrow \top > (A > B)$;

(MOD) $\Box A \rightarrow B > A$, where A is a modal formula;

(U4) $\Box A \rightarrow \Box\Box A$, where A is a modal formula;

(U5) $\Diamond A \rightarrow \Box\Diamond A$, where A is a modal formula.

Note that a conditional formula $\top > A$ can be regarded as a belief operator
meaning that "A is believed". Moreover, Axioms (REFL), (EUC) and (TRANS)
(the last two for $A = \top$) make the belief operator an S5 modality.

Our logic has strong similarities with Stalnaker's logic. First, from (DT) and
(REFL) we can derive (MP) $((A > B) \rightarrow (A \rightarrow B))$ restricted to the case in
which $A, B \in \mathcal{L}$. Moreover, if we assume the axiom $A \rightarrow (\top > A)$, from (CV)
we derive (CS) $((A \wedge B) \rightarrow (A > B))$, again restricted to the case in which
$A, B \in \mathcal{L}$, from (EUC) we derive the usual (CEM) $((A > B) \vee (A > \neg B))$
and axioms (TRANS) and (BEL) become tautological. Note that all axioms
(ID), (CV), (MOD), (MP), (CS) and (CEM) belong to the axiomatization of
Stalnaker's logic C2 (see [14]).

Our logic intends to model belief revision, in the sense of the representation 
theorem given in section 4. Before establishing a correspondence between the 
axioms of our logic and Gardenfors’ postulates, let us describe the model theory 
of our logic. 

We develop a semantical interpretation for the logic BC in the style of stan- 
dard Kripke-like semantics for conditional logics. Our structures are possible 
world structures equipped with a selection function. 

Definition 2. A BC-structure M has the form (W, f, [[ ]]), where W is a set of
possible worlds, $f : \mathcal{L}_> \times W \rightarrow 2^W$ is a selection function, $[[\ ]] : \mathcal{L}_> \rightarrow P(W)$ is a
valuation function satisfying the following conditions:

(1) $[[A \wedge B]] = [[A]] \cap [[B]]$

(2) $[[\neg A]] = W - [[A]]$

(3) $[[A \rightarrow B]] = (W - [[A]]) \cup [[B]]$

(4) $[[A > B]] = \{w : f(A, w) \subseteq [[B]]\}$;
Moreover, let $Prop(S) = \{A \in \mathcal{L} : \exists w \in S \text{ such that } w \in [[A]]\}$. We assume
that the selection function f satisfies the following properties:

(ID) $f(A, w) \subseteq [[A]]$;

(RCEA) if $[[A]] = [[B]]$ then $f(A, w) = f(B, w)$;

(DT) $Prop(f(A, w) \cap [[C]]) \subseteq Prop(f(A \wedge C, w))$, for $A, C \in \mathcal{L}$;

(CV) $f(A, w) \cap [[C]] \ne \emptyset \rightarrow Prop(f(A \wedge C, w)) \subseteq Prop(f(A, w) \cap [[C]])$, for
$A, C \in \mathcal{L}$;

(REFL) $w \in f(\top, w)$;

(TRANS) $x \in f(A, w) \wedge y \in f(\top, x) \rightarrow y \in f(A, w)$;

(EUC) $x, y \in f(A, w) \rightarrow x \in f(\top, y)$;

(BEL) $w \in f(\top, y) \rightarrow f(A, w) = f(A, y)$;

(MOD) if $f(B, w) \cap [[A]] \ne \emptyset$, then $f(A, w) \ne \emptyset$, where A is a modal formula;

(UNIV) if $[[A]] \ne \emptyset$, then $\exists B$ such that $f(B, w) \cap [[A]] \ne \emptyset$, where A is a modal
formula.

We say that a formula A is true in a BC-structure M = (W, f, [[ ]]) if [[A]] = W.
We say that a formula is BC-valid if it is true in every BC-structure. We also
introduce the following notation: $S \models_M A$ says that, given a BC-structure M,
a set of formulas S and a formula A, for all $w \in M$, if $w \in [[B]]$ for all $B \in S$,
then $w \in [[A]]$.
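
The valuation clauses can be turned into a small evaluator over a finite structure. The sketch below is only an illustration: formulas are nested tuples, worlds are truth assignments to two atoms, and the selection function is a deliberately trivial one that returns all A-worlds regardless of the current world (it is keyed on the extension [[A]], so formulas with the same extension get the same selection). The names `select`, `ext`, `box` and `diamond` are hypothetical, and the two modalities are encoded as ¬A > ⊥ and ¬(A > ⊥).

```python
from itertools import product

WORLDS = frozenset(product((False, True), repeat=2))  # assignments to (p, q)
ATOM_INDEX = {"p": 0, "q": 1}


def select(ext_a, w):
    """Toy selection function f(A, w): all A-worlds, whatever w is."""
    return ext_a


def ext(formula):
    """[[formula]] for formulas given as nested tuples."""
    op = formula[0]
    if op == "top":
        return WORLDS
    if op == "bot":
        return frozenset()
    if op == "atom":
        return frozenset(w for w in WORLDS if w[ATOM_INDEX[formula[1]]])
    if op == "not":
        return WORLDS - ext(formula[1])
    if op == "and":
        return ext(formula[1]) & ext(formula[2])
    if op == "imp":
        return (WORLDS - ext(formula[1])) | ext(formula[2])
    if op == "cond":  # clause (4): {w : f(A, w) ⊆ [[B]]}
        a, b = ext(formula[1]), ext(formula[2])
        return frozenset(w for w in WORLDS if select(a, w) <= b)
    raise ValueError(f"unknown connective: {op}")


def box(a):
    """Box A, encoded as ¬A > ⊥."""
    return ("cond", ("not", a), ("bot",))


def diamond(a):
    """Diamond A, encoded as ¬(A > ⊥)."""
    return ("not", ("cond", a, ("bot",)))
```

With this toy selection function a conditional A > B is true everywhere exactly when every A-world is a B-world; a less trivial f would let the selected worlds depend on what is believed at w.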

Note that, in a given BC-structure M, we can define through the selection
function f an equivalence relation R on the set of worlds W as follows: for all
$w, w' \in W$,

$(w, w') \in R$ iff $w' \in f(\top, w)$.

The properties of R being reflexive, transitive and euclidean come from the
semantic conditions (REFL), (TRANS) and (EUC) on the selection function f
(and, more precisely, from the last two conditions by taking $A = \top$). Hence, we
can read $\top > A$ as "A is believed", in contrast to the meaning of A which is
"A is true".

The accessibility relation R determines equivalence classes among worlds.
The intuition is that we can associate to each world a belief set which is the set
of formulas true in all worlds in its equivalence class. As a consequence of axiom
(BEL), evaluating a conditional formula $A > B$ in a world is exactly the same as
evaluating that formula in a different world in the same class. This means that
the value of $A > B$ in one world only depends on the beliefs true in that world
and not on the objective facts true in that world.

When a conditional $A > B$ is evaluated in a world w, the selection function
selects the set f(A, w) of the most preferred A-worlds with respect to w, and B
is evaluated in such a set of worlds. Axioms (EUC) and (TRANS) make f(A, w)
an equivalence class to which we can associate a belief set. Notice that, since
$(A > C) \vee \neg(A > C)$ is a tautology, from (EUC) and (TRANS) we can conclude
$(A > (\top > C)) \vee (A > \neg(\top > C))$, that is, C is either believed or not believed
in the most preferred A-worlds. This is the conditional excluded middle, (CEM),
restricted to belief formulas. While the presence of (CEM) in Stalnaker's logic
causes the selection function to select a single world (i.e. $f(A, w) = \{j\}$ for
all A and w), when (CEM) is restricted to belief formulas (as in our logic), it
determines the unicity of the belief set associated to all worlds belonging to
f(A, w). When we evaluate a conditional, we want to move from one world, with
an associated belief set, to other worlds with a different belief set. This ability to
explicitly represent the new belief set obtained from the initial one is especially
important if we want to model iterated revision by nested conditionals. We will
come back to this point in section 4.

The restrictions we have imposed on our axioms are motivated by the fact 
that our logic is intended to model belief revision. Concerning (CV) and (DT), 
we observe that the revision postulates do not deal with conditional formulas. 
In particular, the Preservation Principle says something only about the classical 
formulas belonging to the revised belief set, whereas, as we have seen in the 
Introduction, the conditional formulas accepted in the original belief set might 
no longer be accepted in the revised one, since they globally depend on the whole 
belief set. 

From the semantical condition (UNIV), which corresponds to (U4) and (U5),
and from (MOD) we get the property:

if $[[A]] \ne \emptyset$, then $f(A, w) \ne \emptyset$,

which, as we will see, is needed to model the revision postulate (K5). The
restrictions we have put on (MOD), (U4), (U5) are needed since we cannot accept that
the above property holds for all formulas $A \in \mathcal{L}_>$. Having this property for
arbitrary A would correspond to being able to reach any belief set from any other.
This cannot be done by means of the revision operator. In general, given two
belief sets $K_1$ and $K_2$, there may not exist a formula A such that $K_2 = K_1 * A$
(for instance when $B \in K_1$ but neither $B \in K_2$ nor $\neg B \in K_2$).

The axiomatization of BC is sound and complete with respect to the semantics
introduced above.

In the following, for readability, we use the notation $x \models A$ rather than
$x \in [[A]]$.

Theorem 3 (Soundness). If a formula A is a theorem of BC then it is BC-valid.

Proof. (Sketch) One checks each axiom and then shows that rules (RCEA) and
(RCK) preserve validity. As an example, we give a proof of the validity of (CV)
and (U5). Let $M = (W, f, [[\ ]]^M)$ be a BC-structure. For (CV), let $x \in W$ and
let $A, B, C \in \mathcal{L}$, $x \models \neg(A > \neg C)$ and $x \models A > B$. Let $y \in f(A \wedge C, x)$; we must
show that $y \models B$. Let Atom(B) be the set of propositional variables occurring
in B and let

$\psi_{B,y} = \bigwedge \{p \in Atom(B) \mid y \models p\} \wedge \bigwedge \{\neg p \mid p \in Atom(B),\ y \not\models p\}$.

Since $B \in \mathcal{L}$, we clearly have $y \models B$ iff $\psi_{B,y} \rightarrow B$ is a classical tautology.
Moreover, $\psi_{B,y} \in Prop(f(A \wedge C, x))$. By hypothesis, $x \models \neg(A > \neg C)$ and
this implies that $f(A, x) \cap [[C]] \ne \emptyset$. By condition (CV) we can conclude that
$\psi_{B,y} \in Prop(f(A, x) \cap [[C]])$, thus for some $z \in f(A, x) \cap [[C]]$ we have $z \models \psi_{B,y}$
and $z \models B$, for $z \in f(A, x)$ and $x \models A > B$. Let $\psi_{B,z}$ be defined in a similar
way to $\psi_{B,y}$. Since $z \models \psi_{B,y}$, it must be that $\psi_{B,z} \leftrightarrow \psi_{B,y}$ holds; on the other
hand from $z \models B$, we have that $\psi_{B,z} \rightarrow B$ is a classical tautology. We can conclude
that $\psi_{B,y} \rightarrow B$, whence $y \models B$.
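
The key step of the (CV) argument is that, for a classical formula B, a world y satisfies B exactly when the conjunction of literals describing y on the atoms of B (written ψ_{B,y} in the proof) classically implies B. The sketch below checks this by brute force; the tuple encoding of formulas and the names `atoms_of`, `holds`, `psi` and `psi_implies` are assumptions made for illustration only.

```python
from itertools import product

ATOMS = ("p", "q", "r")


def atoms_of(formula):
    """Atom(B): propositional variables occurring in a nested-tuple formula."""
    if formula[0] == "atom":
        return {formula[1]}
    return set().union(*(atoms_of(sub) for sub in formula[1:]))


def holds(formula, world):
    """Classical truth of a formula in a world (a dict from atoms to booleans)."""
    op = formula[0]
    if op == "atom":
        return world[formula[1]]
    if op == "not":
        return not holds(formula[1], world)
    if op == "and":
        return holds(formula[1], world) and holds(formula[2], world)
    if op == "imp":
        return (not holds(formula[1], world)) or holds(formula[2], world)
    raise ValueError(f"unknown connective: {op}")


def psi(formula, world):
    """The literal description of `world` on the atoms of `formula`, as a dict."""
    return {p: world[p] for p in atoms_of(formula)}


def psi_implies(formula, world):
    """Is (description of world) → formula a tautology? Checked exhaustively."""
    fixed = psi(formula, world)
    for values in product((False, True), repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        if all(v[p] == val for p, val in fixed.items()) and not holds(formula, v):
            return False
    return True
```

Because the description only mentions the atoms of B, two worlds that agree on those atoms yield the same description, which is what lets the proof transfer the truth of B from z to y.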

For (U5) let $x \in W$ and let (1) $x \models \Diamond A$, where A is a modal formula. Suppose
that (2) $x \not\models \Box\Diamond A$. By (1) we have $f(A, x) \ne \emptyset$ (we recall that $\Diamond A \equiv \neg(A > \bot)$),
whence $[[A]] \ne \emptyset$. Observe that $\Box\Diamond A \equiv (A > \bot) > \bot$, thus by (2) $x \not\models (A >
\bot) > \bot$ and there is $z \in f(A > \bot, x)$. We have that $z \models A > \bot$, i.e. $f(A, z) = \emptyset$,
whence $[[A]] = \emptyset$ by (MOD) and (UNIV). We have a contradiction.

□

Theorem 4 (Completeness). If A is BC-valid then it is a theorem of BC.

Proof. (Sketch) By contraposition, we show that if $\not\vdash A$ then there is a BC-structure
M in which A is not true. Let us fix the language $\mathcal{L}_>$. As usual we can
prove that if $\not\vdash A$, then there is a maximal consistent set of formulas $X_0$ which
does not contain A. We assume that the usual properties of maximal consistent
sets are known (e.g. if X is maximally consistent, then $D \in X$ or $\neg D \in X$). We
define $M = (W, f, [[\ ]]^M)$ as follows:

$W = \{X \subseteq \mathcal{L}_> \mid X \not\vdash \bot \text{ and } X \text{ is maximally consistent}\}$,
$f(B, X) = \{Y \in W \mid S_{B,X} \subseteq Y\}$,
where $S_{B,X} = \{C \in \mathcal{L}_> \mid B > C \in X\}$;
$[[p]]^M = \{X \in W \mid p \in X\}$.

One can prove the following facts.

Fact 1 For every formula $B \in \mathcal{L}_>$ and $X \in W$, $B \in X$ iff $X \in [[B]]^M$.

Fact 2 The structure M satisfies all conditions of Definition 2, except (possibly)
the condition (UNIV), namely, (ID), (RCEA), (DT), (CV), (REFL),
(TRANS), (EUC), (BEL), and (MOD). As an example, we prove conditions (CV)
and (BEL); the others are similar and left to the reader. For (CV), let

$f(D, X) \cap [[C]] \ne \emptyset$ and $\psi \in Prop(f(D \wedge C, X))$,

where $D, C, \psi \in \mathcal{L}$. Suppose that $\psi \notin Prop(f(D, X) \cap [[C]])$; then for all $U \in
f(D, X) \cap [[C]]$, we have $\neg\psi \in U$. From the fact that if $C \notin U$, then $\neg C \in U$
and the fact that the U's are deductively closed, we easily have that for all
$U \in f(D, X)$, $C \rightarrow \neg\psi \in U$; this implies that $D > (C \rightarrow \neg\psi) \in X$. By (CV),
we get that

(*) $(D \wedge C) > (C \rightarrow \neg\psi) \in X$.

Since $\psi \in Prop(f(D \wedge C, X))$, there exists $Z \in f(D \wedge C, X)$ such that $\psi \in Z$;
on the other hand by (*), $C \rightarrow \neg\psi \in Z$ and since $C \in Z$, we get $\neg\psi \in Z$; we
have a contradiction.

For (BEL) let $X \in f(\top, Y)$; we show that $f(D, X) = f(D, Y)$. By definition
of f, it suffices to show that $S_{D,X} = S_{D,Y}$. If $B \in S_{D,Y}$ then $D > B \in Y$, whence
$\top > (D > B) \in Y$ by (BEL). Thus $D > B \in X$ and $B \in S_{D,X}$. Conversely, if
$B \notin S_{D,Y}$, then $D > B \notin Y$ and $\top > (D > B) \notin Y$ by (REFL). This implies that
$\neg(\top > (D > B)) \in Y$, whence $\top > \neg(\top > (D > B)) \in Y$ by (EUC). Thus,
$\neg(\top > (D > B)) \in X$ and also $\neg(D > B) \in X$ by (BEL), whence $D > B \notin X$,
i.e. $B \notin S_{D,X}$.

The structure $M$ does not necessarily satisfy the condition (UNIV). Our plan
is to define a substructure $M_0$ of $M$ which is still a BC-structure,
falsifies $A$ and satisfies the universality condition. In order to define
$M_0$, let $X_0 \in W$ be such that $A \notin X_0$ (whence
$\neg A \in X_0$). We define a binary relation $R$ on $W$: for $X, Y \in W$,
let

$RXY \equiv \forall D$ modal formula $(\Box D \in X \to D \in Y)$,

and then we let

$W_0 = \{Y \in W \mid RX_0Y\}$.

We first show that $W_0$ is closed with respect to $f$, i.e.

(i) if $Z \in W_0$ and $Y \in f(B, Z)$, then $Y \in W_0$,

(ii) for all $Y, Z \in W_0$, $RYZ$ holds.

For (i) let $\Box D \in X_0$; then $\Box\Box D \in X_0$ by (U4), so
$\Box D \in Z$; by (MOD) we obtain $B > D \in Z$, whence $D \in Y$.

For (ii), let $RX_0Y$ and $RX_0Z$; we show that $RYZ$ holds. Suppose
$D \notin Z$; then $\Box D \notin X_0$, then $\neg\Box D \in X_0$, then
$\Diamond\neg D \in X_0$, so that $\Box\Diamond\neg D \in X_0$ by (U5).
Then $\Diamond\neg D \in Y$, and this implies that $\Box D \notin Y$.

Finally, we can show that

(iii) $X_0 \in W_0$.

In this regard, if $\Box D \in X_0$, then $\top > D \in X_0$ by (MOD) and
$D \in X_0$ by (REFL).

We can now define a structure $M_0 = \langle W_0, f_0, [[\ ]]^{M_0} \rangle$,
where

$f_0(B, Z) = f(B, Z)$ and $[[p]]^{M_0} = [[p]]^M \cap W_0$;

in particular, the definition of $f_0$ is correct by virtue of (i).

Fact 3 $M_0$ satisfies all conditions of Definition 2; in particular, $M_0$
satisfies the condition (UNIV). In order to check (UNIV), let $D$ be a modal
formula and suppose that for all formulas $B$ and $Z \in W_0$,
$f(B, Z) \cap [[D]]^{M_0} = \emptyset$; in particular we have
$f(D, Z) \cap [[D]]^{M_0} = \emptyset$. This implies
$f(D, Z) = \emptyset$, that is $D > \bot \in Z$, whence
$\Box\neg D \in Z$. By (ii) we have that for every $Y \in W_0$, $RZY$ holds,
thus $\neg D \in Y$, i.e. $[[D]]^{M_0} = \emptyset$.

Fact 4 For each formula $C$, $[[C]]^{M_0} = [[C]]^M \cap W_0$. This is
proved by induction on the form of $C$; the details are left to the reader.

We can now conclude the completeness proof. If $A$ is not a theorem of BC,
then $X_0 \notin [[A]]^M$. Since $[[A]]^{M_0} = [[A]]^M \cap W_0$ and
$X_0 \in W_0$, we have that $W_0 - [[A]]^{M_0} \neq \emptyset$, which shows
that $A$ is not true in $M_0$.

□ 







4 Conditionals and Revision 

The capability of defining a belief operator through conditional implication is
central to our way of modeling belief revision. Given a belief set $K$, we
represent it by a set of belief formulas $Th_K$: all the formulas $C$ in $K$
are believed, while all formulas not in $K$ are disbelieved in $Th_K$. More
precisely, we define

$Th_K = \{\top > C : C \in K\} \cup \{\neg(\top > C) : C \notin K\}$.

Checking whether $B$ belongs to the revised belief set $K * A$ corresponds, in
our logic, to evaluating the conditional $A > B$ at all worlds satisfying the
theory $Th_K$, that is, at all worlds whose corresponding belief set is $K$.

Before providing a representation theorem which establishes a precise
correspondence between belief revision systems and our BC-structures, let us
give an intuitive idea of the relationship between the AGM postulates and the
axioms of our logic (or, equivalently, the semantic properties of
BC-structures).

Let us consider a single world $w_K$ at which $Th_K$ holds in a given
BC-structure. Roughly speaking, the worlds in the equivalence class of $w_K$
are the classical interpretations of $K$. If we want to check whether $B$
belongs to the revised belief set $K * A$, we can evaluate the conditional
$A > B$ at $w_K$. Then, the new belief set $K * A$ will be represented by the
set of worlds $f(A, w_K)$, the set of the most preferred $A$-worlds with
respect to the world $w_K$. Moreover, we can represent the belief set $K + A$,
obtained by an expansion of $K$ with $A$, as the subset of $f(\top, w_K)$
satisfying $A$, namely $f(\top, w_K) \cap [[A]]$. More precisely, given a
belief set $K$, some BC-structure $M$, and a world $w_K$ such that
$w_K \models_M Th_K$, we can define

$K * A = \{B : w_K \models_M A > B\}$ and

$K + A = \{B : w_K \models_M \top > (A \to B)\}$.
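The two definitions above can be explored on a toy finite model. The sketch
below is only an illustration under strong assumptions: belief sets are
represented semantically by their sets of classical models, and the selection
function is replaced by a Hamming-distance choice of closest worlds, which is
our assumption and not the function $f$ of a BC-structure. All names (`ATOMS`,
`select`, `revise`, `expand`, `believes`) are hypothetical.

```python
from itertools import product

ATOMS = ("p", "q")  # hypothetical propositional letters


def all_worlds():
    """Every classical interpretation over ATOMS, as a frozenset of true atoms."""
    return [frozenset(a for a, bit in zip(ATOMS, bits) if bit)
            for bits in product([False, True], repeat=len(ATOMS))]


def models(formula):
    """Worlds satisfying `formula`, where a formula is a predicate on a world."""
    return {w for w in all_worlds() if formula(w)}


def select(a_worlds, k_worlds):
    """Toy stand-in for f(A, w_K): the A-worlds closest to the models of K.

    Closeness is Hamming distance; this is an assumption of the sketch.
    K is assumed consistent, i.e. k_worlds must be non-empty.
    """
    if not k_worlds:
        raise ValueError("the belief set K is assumed to be consistent")
    if not a_worlds:
        return set()

    def dist(w):
        return min(len(w ^ v) for v in k_worlds)

    best = min(dist(w) for w in a_worlds)
    return {w for w in a_worlds if dist(w) == best}


def revise(k_worlds, a):
    """Models of K * A: the selected A-worlds."""
    return select(models(a), k_worlds)


def expand(k_worlds, a):
    """Models of K + A: the models of K that satisfy A."""
    return k_worlds & models(a)


def believes(worlds, b):
    """B belongs to the belief set iff B holds in every world representing it."""
    return all(b(w) for w in worlds)
```

For instance, with `k = models(lambda w: "p" in w and "q" in w)`, revising by
the formula "not p" keeps the belief in q, whereas expanding by "not p" yields
no worlds at all, i.e. the inconsistent belief set.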

The representation theorem below will show that the revision operator *
defined above satisfies the AGM postulates. Here we provide some examples to
show how BC-structures relate to the AGM postulates.

Let us consider postulate (K4): if $\neg A \notin K$, then
$K + A \subseteq K * A$. From $\neg A \notin K$, by definition of $Th_K$, we
have that $Th_K \models_M \neg(\top > \neg A)$, and hence
$w_K \models_M \neg(\top > \neg A)$. Moreover, from $B \in K + A$ we get
$w_K \models_M \top > (A \to B)$. From the following consequence of axiom (CV),

$\neg(\top > \neg A) \wedge (\top > (A \to B)) \to ((\top \wedge A) > (A \to B))$,

we get, by (RCK), $w_K \models_M A > (A \to B)$. Hence, by using (ID), (RCK)
and propositional reasoning, we can conclude $w_K \models_M A > B$, and
therefore $B \in K * A$.

Let us consider postulate (K5): $(K * A) = K_\bot$ only if $\vdash \neg A$.
Assume that $(K * A) = K_\bot$; then $\bot \in (K * A)$ and
$w_K \models_M A > \bot$. Therefore, $f(A, w_K) = \emptyset$. As a
consequence of (MOD) and (UNIV), we can conclude that $[[A]] = \emptyset$.
From this we can conclude that $\vdash \neg A$ only if the model $M$ under
consideration has the property that for every satisfiable formula $A$ there is
a world in the model which satisfies it. Hence, we will make use of such a
condition in the representation theorem below.
We say that a BC-structure $M = \langle W, f, [[\ ]] \rangle$ satisfies the
covering condition if, for any formula $A \in \mathcal{L}$ satisfiable in PC,
$[[A]] \neq \emptyset$ (i.e., there is some world satisfying $A$ in $M$). The
following representation theorem describes the relationship between our logic
BC and belief revision.

Theorem 5 (Representation theorem).

(1) Given a belief revision system $\langle \mathcal{K}, * \rangle$ such that
the revision operator * satisfies postulates K1-K8, there exists a
BC-structure $M_*$ such that for each consistent belief set $K$ in
$\mathcal{K}$, and $A, B \in \mathcal{L}$,

$B \in K * A$ if and only if $Th_K \models_{M_*} A > B$.

(2) Given a BC-structure $M = \langle W, f, [[\ ]] \rangle$ which satisfies
the covering condition, there is a belief revision system
$\langle \mathcal{K}_M, *_M \rangle$ such that
$\mathcal{K}_M = \{K \subseteq \mathcal{L} : K = Cn(K)$ and
$[[Th_K]] \neq \emptyset\}$ and, for each belief set $K$ of $\mathcal{K}_M$,
and $A, B \in \mathcal{L}$,

$B \in K *_M A$ if and only if $w_K \models_M A > B$,

for some $w_K$ such that $w_K \models_M Th_K$.

Proof. (Sketch)

To prove part (1), we define a BC-structure
$M_* = \langle W, f, [[\ ]] \rangle$ as follows:

$W = \{(K, w) : K \in \mathcal{K}$ and $w \models_{PC} K\}$;

$C_K = \{(K', w) \in W : K' = K\}$^;

$[[p]] = \{(K, w) \in W : w \models_{PC} p\}$, for all propositional letters
$p \in \mathcal{L}$;

$f(A, (K, w))$ and $[[A]]$ can be defined by double induction on the structure
of the formula $A$. At each induction step, for each connective $\circ$,
$[[A \circ B]]$ is defined by making use of the valuation of the subformulas
($[[A]]$ and $[[B]]$) and of the selection function for subformulas (for
instance, $f(A, w)$); moreover, $f(A \circ B, w)$ is defined by possibly making
use of the valuation of the formula $A \circ B$ itself. In particular we let:

$f(A, (K, w)) = C_{K * A}$, if $A \in \mathcal{L}$;

$f(A, (K, w)) = C_{K * \phi_A}$, if $A \notin \mathcal{L}$ and there exists a
formula $\phi_A \in \mathcal{L}$ such that $[[A]] = [[\phi_A]]$;

$f(A, (K, w)) = \emptyset$, otherwise.

By making use of the properties of the revision operator *, we can show that
$M_*$ is a BC-structure. Furthermore, it can easily be shown that the model
$M_*$ satisfies the condition: $B \in K * A$ if and only if
$Th_K \models_{M_*} A > B$, by making use of the crucial property that if
$(K', w') \models_{M_*} Th_K$, then $K' = K$.

To prove part (2), we define a belief revision system
$\langle \mathcal{K}_M, *_M \rangle$ with
$\mathcal{K}_M = \{K : K = Cn(K)$ and $[[Th_K]] \neq \emptyset\}$ and the
revision operator $*_M$ as follows:

$K *_M A = \{B \in \mathcal{L} : w_K \models_M A > B\}$,

for some $w_K$ such that $w_K \models_M Th_K$. It can easily be shown that the
revision operator $*_M$ satisfies postulates (K1)-(K8). For postulates (K4)
and (K5) we refer to the intuitive explanation above. We consider the
remaining cases.

^ In particular, for the inconsistent belief set $K_\bot$, we take
$C_{K_\bot} = \emptyset$.

Postulate (K1): we have to show that $K * A$ is deductively closed. Let
$C \in \mathcal{L}$ be a logical consequence of $K * A$. Then, by the
compactness of classical propositional logic, there is a finite set of
formulas $\beta_1, \ldots, \beta_n \in K * A$ such that
$\beta_1 \wedge \ldots \wedge \beta_n \to C$ is a tautology. From the
definition of $K * A$ we get that $w_K \models_M A > \beta_i$ for all
$i = 1, \ldots, n$ and, by propositional reasoning and (RCK),
$w_K \models_M A > C$. Hence, $C \in K * A$.

Postulate (K2): we have to show that $A \in K * A$. This follows from the fact
that $w_K \models_M A > A$ holds, by axiom (ID).

Postulate (K3): we have to show that $(K * A) \subseteq (K + A)$. Let us
assume that $B \in (K * A)$. Then $w_K \models_M A > B$ and, by propositional
reasoning, $w_K \models_M (\top \wedge A) > B$. Hence, by applying (DT), we
conclude $w_K \models_M \top > (A \to B)$, which means that $B \in (K + A)$.

Postulate (K6): we have to show that if $\vdash A \leftrightarrow B$ then
$K * A = K * B$. Assume that $A$ and $B$ are two equivalent formulas. Then, by
(RCEA), $(A > C) \leftrightarrow (B > C)$ is valid. Hence, from
$w_K \models_M A > C$ we can conclude $w_K \models_M B > C$, and vice versa.
Therefore $C \in K * A$ if and only if $C \in K * B$.

Postulate (K7) can be shown to hold by making use of axiom (DT) in a way
similar to case (K3), while postulate (K8) can be proved by making use of axiom
(CV) in a way similar to case (K4).

□ 

Notice that the requirement of consistency of a belief set in the
Representation Theorem, part (1), is needed since an inconsistent belief set
$K_\bot$ cannot be represented by a world in a model, due to the presence of
the semantic property (REFL).

Following [6] we say that a logic is non-trivial if there are at least four
formulas $A$, $B$, $C$ and $D$, such that the formulas $A \wedge B$,
$B \wedge C$, $C \wedge A$ are inconsistent, and the formulas $A \wedge D$,
$B \wedge D$, $C \wedge D$ are consistent. Otherwise the logic is trivial.
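The shape of this condition can be checked mechanically on a candidate
quadruple of formulas. The sketch below does so by truth tables in classical
propositional logic only; consistency in BC itself is what Theorem 6 asserts
and is not verified here. The names `ATOMS`, `satisfiable` and
`witnesses_non_triviality`, and the particular formulas chosen, are
hypothetical.

```python
from itertools import product

ATOMS = ("p", "q")  # hypothetical propositional letters


def satisfiable(formula):
    """True iff some truth assignment over ATOMS makes `formula` true."""
    return any(formula(dict(zip(ATOMS, bits)))
               for bits in product([False, True], repeat=len(ATOMS)))


def witnesses_non_triviality(a, b, c, d):
    """Check the non-triviality pattern classically for formulas a, b, c, d."""
    def conj(x, y):
        return lambda v: x(v) and y(v)

    # A and B, B and C, C and A must each be inconsistent.
    pairwise_inconsistent = not any(
        satisfiable(conj(x, y)) for x, y in ((a, b), (b, c), (c, a)))
    # A and D, B and D, C and D must each be consistent.
    each_consistent_with_d = all(satisfiable(conj(x, d)) for x in (a, b, c))
    return pairwise_inconsistent and each_consistent_with_d


# One classical witness: three mutually exclusive formulas and D = true.
A = lambda v: v["p"] and v["q"]
B = lambda v: v["p"] and not v["q"]
C = lambda v: (not v["p"]) and v["q"]
D = lambda v: True
```

Calling `witnesses_non_triviality(A, B, C, D)` confirms that this quadruple
has the required pattern in the classical fragment.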

Theorem 6. The logic BC is consistent and non-trivial.

Intuitively, the reason why Gärdenfors' Triviality Result does not apply to our
logic is that we have adopted a weaker formulation of the Ramsey test which does
not enforce monotonicity. We can notice, indeed, that the theory $Th_K$ depends
nonmonotonically on $K$.

In this logic it is also possible to model iterated revision by nested
conditionals. For instance, to check whether $B$ belongs to
$(K * A_1) * A_2$ we evaluate the conditional $A_1 > (A_2 > B)$ at a world
$w_K$ associated with the belief set $K$. In order to evaluate a formula such
as $A_1 > (A_2 > B)$ at $w_K$, we have to evaluate $A_2 > B$ at worlds whose
belief set is given by $K * A_1$. Hence, the belief set associated with every
world in $f(A_1, w_K)$ (the most preferred $A_1$-worlds for $w_K$) must be
$K * A_1$.

However, we can only treat the case of revision of a consistent belief set $K$
with a sequence of consistent formulas. The assumption that the formulas added
by successive revisions are consistent is common to other proposals, for
instance [9]. In our logic, the revision of a belief set $K$ with an
inconsistent formula results in an empty set of models: if $w_K$ is a world
whose associated belief set is $K$ and $A$ is an inconsistent formula, then
$f(A, w_K)$ has to be empty. This means that for any formula $B$, the formula
$A > B$ is true at $w_K$. In particular, for every $B$ and $C$,
$A > (C > B)$ is true at $w_K$. This essentially means that, once we enter an
inconsistent state, we cannot get out of it. If we want to lift the assumption
of consistency, a possible solution is to modify the logic by removing axiom
(REFL).

5 Conclusions and Related Work 

In this paper we have introduced a conditional logic BC, which is well suited to 
model belief revision. Belief sets can be given a representation in this logic and 
can be associated with worlds by making use of the conditional operator itself. 
In this way we can make the evaluation of conditional formulas in one world 
dependent on the belief set holding at that world. 

Our belief operator has some similarities with the necessity operator
introduced in [12], which, however, is parametric with respect to a knowledge
base $K$. In [12] the satisfiability of a formula is defined with respect to
the model associated with a given belief set $K$, whereas the revision function
is external to models and applies to models. By contrast, since the aim of our
proposal is to depart as little as possible from standard conditional logics
and their model theory, we incorporate the revision function in the models of
our logic, namely in the selection function.

Unlike conditional logic and also unlike our approach, Friedman and Halpern
consider states rather than worlds as their primitive objects. The conditional
language they define is built up using a conditional operator > and a belief
operator B, and it contains only subjective formulas, that is, those formulas
formed by boolean combinations of conditional formulas and belief formulas.
Such a language is completely disjoint from the language $\mathcal{L}$
containing the objective formulas (that is, formulas which contain neither
conditionals nor belief operators), so that, for instance, a formula such as
$A \wedge (A > B)$, with $A \in \mathcal{L}$, is not allowed. Whereas
objective formulas provide the belief assignment at a state, only subjective
formulas are evaluated at a given state. Moreover, only objective formulas are
allowed in the left hand side of a conditional, that is, in a conditional
$\phi > \psi$ the antecedent $\phi$ has to be an objective formula.

In contrast, we do not need to take as primitive a notion of epistemic state,
nor an additional belief modality. Moreover, we do not put syntactic
restrictions on the language and, in particular, we allow free occurrences of
conditionals. A syntactic restriction on the language is not needed: it is
sufficient to impose restrictions on some axioms, namely we require that (DT)
and (CV) only hold for the formulas in $\mathcal{L}$. Instead, we make use of
the conditional operator itself in order to associate a world with a belief
set, and we impose conditions on the selection function in such a way that the
selection function is defined in a world according to the belief set
associated with that world.

In [8] Katsuno and Satoh present a unifying view of nonmonotonic reasoning,
belief revision and conditional logic based on the notion of minimality. More
precisely, they introduce ordered structures and families of ordered structures
as a common ingredient. Ordered structures are triples
$\langle W, \leq, V \rangle$ containing a set $W$ of worlds, a preorder
$\leq$ and a valuation function $V$. They provide a semantic model to evaluate
those conditional formulas that contain no nesting of >. Families of ordered
structures are defined as collections of ordered structures, and their
axiomatization corresponds to well known conditional logics, such as VW, VC,
and SS. While families of ordered structures are used to give a semantic
characterization of update, ordered structures are used to give a semantic
characterization of revision. In particular, Katsuno and Satoh show that, given
a revision operator *, for each belief set $K$, there is an ordered structure
$O_K$ (satisfying the covering condition) such that the formulas in $K$ are
true in all minimal worlds of $O_K$ (written $O_K \models K$) and

$K * A = \{B : O_K \models A > B\}$.

Since an ordered model $O_K$ contains a single ordering relation $\leq$, it
can only represent a single belief set $K$ and its revisions. Moreover, since
ordered structures do not handle nested conditionals, iterated revision cannot
be captured in this formalization. By contrast, we are able to represent
different belief sets and their revisions within a single structure, by
associating worlds with belief sets. For this purpose, we make use of more
complex semantic structures, which might be seen as a particular type of
families of ordered structures.

We argue that our logic is not only well suited for modelling iterated
revision, but that it can also provide a suitable framework in which to capture
other forms of belief change, such as belief update. This possibility is
suggested by the fact that, when $A \to (\top > A)$ is added to our
axiomatization, from (REFL) we obtain the equivalence
$A \leftrightarrow (\top > A)$, and the belief set associated with a world
becomes equal to the set of formulas true in that world. This is what we expect
to happen with updates. In such a case, some of our axioms become tautological
while other axioms coincide with the axioms of the conditional logic presented
in [6] to deal with updates. Obviously, to deal with updates some of the axioms
of BC, which is tailored for revision, should be dropped, since they are not
required for belief update.







References 

1. C.E. Alchourrón, P. Gärdenfors, D. Makinson, On the logic of theory change:
partial meet contraction and revision functions, in Journal of Symbolic Logic,
50:510-530, 1985.

2. N. Friedman, J.Y. Halpern, Conditional Logics of Belief Change, in
Proceedings of the National Conference on Artificial Intelligence (AAAI 94):
915-921, 1994.

3. P. Gärdenfors, Belief Revision and the Ramsey Test for Conditionals, in The
Philosophical Review, 1996.

4. P. Gärdenfors, Belief Revision, in D. Gabbay (ed.), Handbook of Logic in
Artificial Intelligence, 1995.

5. P. Gärdenfors, Knowledge in flux: modeling the dynamics of epistemic states,
MIT Press, Cambridge, Massachusetts, 1988.

6. G. Grahne, Updates and Counterfactuals, in Proceedings of the Second
International Conference on Principles of Knowledge Representation and
Reasoning (KR'91), pp. 269-276.

7. H. Katsuno, A.O. Mendelzon, On the Difference between Updating a Knowledge
Base and Revising it, in Proceedings of the Second International Conference on
Principles of Knowledge Representation and Reasoning, 1991.

8. H. Katsuno, K. Satoh, A unified view of consequence relation, belief
revision and conditional logic, in Proc. 12th International Joint Conference on
Artificial Intelligence (IJCAI'91), pp. 406-412.

9. D. Lehmann, Belief revision revised, in Proc. 14th International Joint
Conference on Artificial Intelligence (IJCAI'95), pp. 1534-1540.

10. I. Levi, Iteration of Conditionals and the Ramsey Test, in Synthese: 49-81,
1988.

11. D. Lewis, Counterfactuals, Blackwell, 1973.

12. W. Nejdl, M. Banagl, Asking About Possibilities - Revision and Update
semantics for Subjunctive Queries, in Lakemeyer, Nebel, Lecture Notes in
Artificial Intelligence: 250-274, 1994.

13. D. Nute, Topics in Conditional Logic, Reidel, Dordrecht, 1980.

14. D. Nute, Conditional Logic, in Handbook of Philosophical Logic, Vol. II,
387-439, 1984.

15. F.P. Ramsey, in A. Mellor (editor), Philosophical Papers, Cambridge
University Press, Cambridge, 1990.

16. R. Stalnaker, A Theory of Conditionals, in N. Rescher (ed.), Studies in
Logical Theory, American Philosophical Quarterly, Monograph Series no. 2,
Blackwell, Oxford: 98-112.




Implicates and Reduction Techniques for 
Temporal Logics* 



Inma P. de Guzmán, Manuel Ojeda-Aciego, and Agustín Valverde

Dept. Matemática Aplicada, Universidad de Málaga, Spain
{guzman, aciego, a_valverde}@ctima.uma.es



Abstract. Reduction strategies are introduced for the future fragment
of a temporal propositional logic on linear discrete time, named FNext.
These reductions are based on the information collected from the
syntactic structure of the formula, which allows the development of
efficient strategies to decrease the size of temporal propositional formulas,
viz. new criteria to detect the validity or unsatisfiability of subformulas,
and a strong generalisation of the pure literal rule. These results, used as
a preprocessing step, make it possible to improve the performance of any
automated theorem prover.



1 Introduction 

The temporal dimension of information, the change of information over time and
knowledge about how it changes have to be considered by many AI systems. There
is obvious interest in designing computationally efficient temporal formalisms,
especially when intelligent tasks are considered, such as planning relational
actions in a changing environment, building common sense reasoning into a
moving robot, or supervising industrial processes.

Temporal logics are widely accepted and frequently used for specifying con- 
current and reactive agents (which can be either physical devices or software 
processes), and in the verification of temporal properties of programs. To verify 
a program, one specifies the desired properties of the program by a formula in 
temporal logic. The program is correct if all its computations satisfy the formula. 
However, in its generality, an algorithmic solution to the verification problem is 
hopeless. For propositional temporal logic, checking the satisfiability of a formula 
can be done algorithmically, and theoretical work on the complexity of program 
verification is being done [3] . The complexity of satisfiability and determination 
of truth in a particular finite structure are considered for different propositional 
linear temporal logics in [7] . 

Linear-time temporal logics have proven [5] to be a successful formalism
for the specification and verification of concurrent systems, but they have a
much wider range of applications; for instance, in [2] a generalisation of the
temporal propositional logic of linear time is presented, which is useful for
stating and proving properties of the generic execution sequence of a parallel
program. On the other hand, relatively complete deductive systems for proving
branching time temporal properties of reactive systems [4] have been recently
developed.

* COST-15: Many-valued logics for computer science applications.

In recent years, several fully automatic methods for verifying temporal
specifications have been introduced: in [6] a tableaux calculus is treated at
length, and a first introduction to the tableaux method for temporal logic can
be found in [8]. However, the scope of these methods is still very limited.
Theorem proving procedures for temporal logics have traditionally been based on
syntactic manipulations of the formula A to be proved but, in general, do not
incorporate the substitution of subformulas in A as in a rewrite system in
which the rewrite relation preserves satisfiability. One source of interest of
such strategies is that they can be easily included into any prover,
specifically into those which are non-clausal.

In this work we focus on the development of a set of reduction strategies
which, through the efficient determination and manipulation of lists of unitary
implicants and implicates, investigates exhaustively the possibility of
decreasing the size of the formula being analysed. The interest of such a set
of reduction techniques is that the performance of a given prover for
linear-time temporal logic can be improved because the size of a formula can be
decreased, at a polynomial cost, as much as possible before branching.

Lists of unitary models, so-called $\Delta$-lists, are associated to each node
in the syntactic tree of the formula and used to study whether the structure of
the syntactic tree has or has not direct information about the validity of the
formula. This way, either the method ends giving this information or,
otherwise, it decreases the size of the problem before applying the next
transformation. So, it is possible to decrease the number of branchings or,
even, to avoid them all.

The ideas in this paper generalise the results in [1], in a self-contained way,
by explicitly extending the reduction strategy to linear-time temporal logic
and, more importantly, by complementing the information in the
$\Delta$-lists by means of the so-called $\hat\Delta$-sets. The former allow
derivation of an equivalent and smaller formula; the latter also allow
derivation of a smaller formula, not equivalent to the previous one, but
equisatisfiable.

The paper is organised as follows:

- Firstly, preliminary concepts, notation and basic definitions are introduced:
specifically, it is worth noting the definition of literal and the way some of
them will be denoted.

- Secondly, $\Delta$-lists, our basic tool, are introduced; their definition
integrates some reductions into the calculation of the $\Delta$-lists. The
theorems required to show how to use the information collected in those lists
are stated.

- Later, the $\hat\Delta$-sets are defined and results that use the
information in these sets are stated. One of these is a generalisation of the
pure literal rule.



2 Preliminary Concepts and Definitions 

In this paper, our object language is the future fragment of the Temporal
Propositional Logic FNext with linear and discrete flow of time, and
connectives $\neg$ (negation), $\wedge$ (conjunction), $\vee$ (disjunction),
$F$ (sometime in the future), $G$ (always in the future), and $\oplus$
(tomorrow); $V$ denotes the set of propositional variables $p, q, r, \ldots$
(possibly subscripted) which is assumed to be completely ordered with the
lexicographical order, e.g. $p_n < q_m$ for all $n, m$, and $p_n < p_m$ if and
only if $n < m$. Given $p \in V$, the formulas $p$ and $\neg p$ are the
classical literals on $p$.

Definition 1. Given a classical propositional literal $\ell$, the temporal
literals^1 on $\ell$, denoted $Lit(\ell)$, are those wff of the form
$\oplus^n\ell$, $F\oplus^n\ell$, $G\oplus^n\ell$, $FG\ell$, $GF\ell$ for all
$n \in \mathbb{N}$.

The notion of temporal negation normal formula, denoted tnnf, is recursively
defined as follows:

1. Any literal is a tnnf.

2. If $A$ and $B$ are tnnf, then $A \vee B$ and $A \wedge B$ are tnnf, which
are called disjunctive and conjunctive tnnf, respectively.

3. If $A$ is a disjunctive tnnf, then $GA$ is a tnnf.

4. If $A$ is a conjunctive tnnf, then $FA$ is a tnnf.

5. A formula is a tnnf if and only if it can be constructed by the previous
rules.

For formulas in tnnf, we will write $\bar p$ for the classical negated literal
$\neg p$.

As usual, a clause is a disjunction of literals and a cube is a conjunction of
literals. In addition, a G-clause is a formula $GB$ where $B$ is a classical
clause, and an F-cube is a formula $FB$ in which $B$ is a classical cube.

We write $\vartheta\ell$ to mean a temporal literal on $\ell$, where
$\vartheta$ is said to be its temporal prefix; if $\vartheta\ell$ is a
temporal literal, then $|\vartheta|$ denotes the number of temporal connectives
in $\vartheta$, and $\bar\vartheta\bar\ell$ denotes its opposite literal, where
$\bar F = G$, $\bar G = F$, $\overline{FG} = GF$, $\overline{GF} = FG$ and
$\bar\oplus = \oplus$.

The transformation of any wff into tnnf can be carried out in linear time by
recursively applying the transformations induced by the double negation, the de
Morgan laws and the equivalences in Fig. 1.



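$\neg\oplus A \equiv \oplus\neg A$        $\oplus FA \equiv F\oplus A$        $\oplus GA \equiv G\oplus A$
$FFA \equiv F\oplus A$        $GGA \equiv G\oplus A$        $FGFA \equiv GFA$
$GFGA \equiv FGA$        $FG\oplus A \equiv FGA$        $GF\oplus A \equiv GFA$
$\oplus\bigvee A_i \equiv \bigvee\oplus A_i$        $\oplus\bigwedge A_i \equiv \bigwedge\oplus A_i$        $\neg FA \equiv G\neg A$
$\neg GA \equiv F\neg A$        $F(\bigvee_{i \in J} A_i) \equiv \bigvee_{i \in J} FA_i$        $G(\bigwedge_{i \in J} A_i) \equiv \bigwedge_{i \in J} GA_i$

Fig. 1.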



In addition, by using the associative laws we will consider expressions like
$A_1 \vee \cdots \vee A_n$ or $A_1 \wedge \cdots \wedge A_n$ as formulas.
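The negation-related part of this transformation can be sketched as a single
recursive pass. The code below is only an illustration: it pushes negations
down to the variables using double negation, the de Morgan laws and the duality
between $F$ and $G$, and it does not perform the remaining rewrites of Fig. 1,
so its output is not necessarily a tnnf. The tuple encoding of formulas and the
tag `"X"` for the tomorrow connective are assumptions of this sketch.

```python
def nnf(f):
    """Push negations down to the variables.

    Formulas are nested tuples (a hypothetical encoding):
    ("var", name), ("not", A), ("and", A, B), ("or", A, B),
    ("F", A), ("G", A) and ("X", A) for the tomorrow connective.
    """
    op = f[0]
    if op == "var":
        return f
    if op in ("and", "or"):
        return (op, nnf(f[1]), nnf(f[2]))
    if op in ("F", "G", "X"):
        return (op, nnf(f[1]))
    if op != "not":
        raise ValueError(f"unknown connective: {op!r}")

    g = f[1]
    gop = g[0]
    if gop == "var":
        return f                      # a classical negated literal
    if gop == "not":
        return nnf(g[1])              # double negation
    if gop == "and":                  # de Morgan
        return ("or", nnf(("not", g[1])), nnf(("not", g[2])))
    if gop == "or":                   # de Morgan
        return ("and", nnf(("not", g[1])), nnf(("not", g[2])))
    dual = {"F": "G", "G": "F", "X": "X"}
    if gop not in dual:
        raise ValueError(f"unknown connective: {gop!r}")
    return (dual[gop], nnf(("not", g[1])))
```

Each node of the input is visited a bounded number of times, which is
consistent with the linear cost mentioned above for this part of the
transformation.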

^1 As we will be concerned only with temporal literals, in the rest of the
paper we will drop the adjective temporal. In addition, we will use the
notation $\oplus^n$ to denote the n-fold application of the connective
$\oplus$.
We will use the standard notion of tree and address of a node in a tree. Given
a tnnf $A$, the syntactic tree of $A$, denoted by $T_A$, is defined as usual.
An address $\eta$ in $T_A$ will mean, when no confusion arises, the subformula
of $A$ corresponding to the node of address $\eta$ in $T_A$; the address of the
root node will be denoted $\varepsilon$.

If $T_C$ is a subtree of $T_A$, then the temporal order of $T_C$ in $T_A$,
denoted $ord_A(C)$, is the number of temporal ancestors of $T_C$ in $T_A$.

We will also use lists with their standard notation, nil, for the empty list.
Elements in a list will be written in juxtaposition.

If $\alpha$ and $\beta$ are lists of literals and $\vartheta\ell$ is a
literal, $\vartheta\ell \in \alpha$ denotes that $\vartheta\ell$ is an element
of $\alpha$, and $\alpha \subseteq \beta$ means that all elements of $\alpha$
are elements of $\beta$. If
$\alpha = \vartheta_1\ell_1\, \vartheta_2\ell_2 \cdots \vartheta_n\ell_n$,
then
$\bar\alpha = \bar\vartheta_1\bar\ell_1\, \bar\vartheta_2\bar\ell_2 \cdots \bar\vartheta_n\bar\ell_n$.

Definition 2. A temporal structure is a tuple $S = (\mathbb{N}, <, h)$, where
$\mathbb{N}$ is the set of natural numbers, $<$ is the standard strict ordering
on $\mathbb{N}$, and $h$ is a temporal interpretation, which is a function
$h : L \to 2^{\mathbb{N}}$, where $L$ is the language of the logic,
satisfying:

1. $h(\neg A) = \mathbb{N} \setminus h(A)$; $h(A \vee B) = h(A) \cup h(B)$

2. $h(A \to B) = (\mathbb{N} \setminus h(A)) \cup h(B)$; $h(A \wedge B) = h(A) \cap h(B)$

3. $t \in h(FA)$ iff there exists $t'$ with $t < t'$ and $t' \in h(A)$

4. $t \in h(GA)$ iff for all $t'$ with $t < t'$ we have $t' \in h(A)$

5. $t \in h(\oplus A)$ iff $t + 1 \in h(A)$

A formula $A$ is said to be satisfiable if there exists a temporal structure
$S = (\mathbb{N}, <, h)$ such that $h(A) \neq \emptyset$; if $t \in h(A)$,
then $h$ is said to be a model of $A$ in $t$; if $h(A) = \mathbb{N}$, then $A$
is said to be true in the temporal structure $S$; if $A$ is true in every
temporal structure, then $A$ is said to be valid, and we denote it
$\models A$.

Formulas $A$ and $B$ are said to be equisatisfiable if $A$ is satisfiable iff
$B$ is satisfiable; $\equiv$ denotes semantic equality, i.e. $A \equiv B$ if
and only if for every temporal structure $S = (\mathbb{N}, <, h)$ we have that
$h(A) = h(B)$; finally, the symbols $\top$ and $\bot$ mean truth and falsity,
i.e. $h(\top) = \mathbb{N}$ and $h(\bot) = \emptyset$ for every temporal
structure $S = (\mathbb{N}, <, h)$.
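The clauses of Definition 2 can be executed directly on interpretations that
are ultimately periodic, i.e. a finite prefix of states followed by a finite
loop repeated forever. The sketch below assumes such a representation (which is
a restriction of ours, not part of the definition), encodes formulas as nested
tuples with `"X"` standing for the tomorrow connective, and uses the
hypothetical name `holds`. Because $F$ and $G$ are strict, they only inspect
instants after $t$; inspecting one prefix length plus one loop length of future
instants is enough, since later instants repeat states already seen.

```python
def holds(f, t, prefix, loop):
    """Truth of formula f at instant t in the interpretation prefix . loop^omega.

    prefix and loop are lists of sets of variable names; loop must be non-empty.
    Formulas: ("var", name), ("not", A), ("and", A, B), ("or", A, B),
    ("F", A), ("G", A), ("X", A).
    """
    if not loop:
        raise ValueError("loop must contain at least one state")
    period = len(prefix) + len(loop)

    def state(i):
        if i < len(prefix):
            return prefix[i]
        return loop[(i - len(prefix)) % len(loop)]

    op = f[0]
    if op == "var":
        return f[1] in state(t)
    if op == "not":
        return not holds(f[1], t, prefix, loop)
    if op == "and":
        return holds(f[1], t, prefix, loop) and holds(f[2], t, prefix, loop)
    if op == "or":
        return holds(f[1], t, prefix, loop) or holds(f[2], t, prefix, loop)
    if op == "X":
        return holds(f[1], t + 1, prefix, loop)

    # Strict future: instants t+1, ..., t+period cover every distinct
    # situation that can occur after t in an ultimately periodic model.
    future = range(t + 1, t + 1 + period)
    if op == "F":
        return any(holds(f[1], u, prefix, loop) for u in future)
    if op == "G":
        return all(holds(f[1], u, prefix, loop) for u in future)
    raise ValueError(f"unknown connective: {op!r}")
```

For example, in the model where p holds exactly at the even instants, the
formula GF p holds at instant 0 while FG p does not. The recursion is
exponential in the nesting depth of temporal connectives, so this is meant for
small experiments only.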

If $E_1$ and $E_2$ are sets of subformulas in $A$ and $X$ and $Y$ are
subformulas, then the expression $A[E_1/X, E_2/Y]$ denotes the formula obtained
after substituting in $A$ every occurrence of elements in $E_1$ by $X$ and
every occurrence of elements in $E_2$ by $Y$.

If $\eta$ is an address in $T_A$ and $X$ is a formula, then the expression
$A[\eta/X]$ is the formula obtained after substituting in $A$ the subtree
rooted in $\eta$ by $X$.

3 Adding Information to the Tree: $\Delta$-lists

The idea underlying the reduction strategy we are going to introduce is the use
of information given by partial assignments. We associate to each tnnf $A$ two
lists of literals denoted $\Delta_0(A)$ and $\Delta_1(A)$ (the associated
$\Delta$-lists of $A$)^2 and two sets of lists, denoted $\hat\Delta_0(A)$ and
$\hat\Delta_1(A)$, whose elements are obtained out of the associated
$\Delta$-lists of the subformulas of $A$.

^2 It can be shown that either $A$ is equivalent to a literal, or at most one
of these lists is non-empty.

The $\Delta$-lists and the $\hat\Delta$-sets are the key tools of our method
to reduce the size of the formula being analysed. These reductions make it
possible to study its satisfiability with as few branchings as possible.

In a nutshell, $\Delta_0(A)$ and $\Delta_1(A)$ are, respectively, lists of
temporal implicates and temporal implicants of $A$. The purpose of these lists
is two-fold:

1. To transform the formula $A$ into an equivalent and smaller-sized one (see
Sect. 3.3).

2. To be used in the definition of the $\hat\Delta_b$ sets (see Sect. 4),
which will be used to transform the formula $A$ into an equisatisfiable and
smaller-sized one. Furthermore, information to build a countermodel (if it
exists) is provided.

The sense in which we mean temporal implicant/implicate is the following:

Definition 3.

- A literal $\vartheta\ell$ is a temporal implicant of $A$ if
$\models \vartheta\ell \to A$.

- A literal $\vartheta\ell$ is a temporal implicate of $A$ if
$\models A \to \vartheta\ell$.



3.1 The Lattices of Literals 

Definition 4. For each classical propositional literal $\ell$ we define an
ordering in $Lit(\ell) \cup \{\bot, \top\}$ as follows:

1. $\vartheta\ell \leq \varrho\ell$ if and only if
$\models \vartheta\ell \to \varrho\ell$.

2. $\vartheta\ell \leq \top$ for all (possibly empty) $\vartheta$.

3. $\vartheta\ell \geq \bot$ for all (possibly empty) $\vartheta$.

Each set $Lit(\ell) \cup \{\bot, \top\}$ provided with this ordering is a
lattice, depicted in Figure 2. For each literal $\vartheta\ell$ we will
consider its upward and downward closures, denoted $\vartheta\ell\uparrow$ and
$\vartheta\ell\downarrow$.
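Item 1 of the definition can be turned into a small decision procedure for
literals on the same classical literal. The sketch below is derived directly
from the strict semantics of $F$ and $G$ in Definition 2 and is not a
transcription of Figure 2; $\bot$ and $\top$ are left out. A literal is encoded
as a pair `(prefix, n)`, an encoding that is our assumption: `("X", n)` for
$\oplus^n\ell$, `("F", n)` for $F\oplus^n\ell$, `("G", n)` for
$G\oplus^n\ell$, and `("FG", 0)`, `("GF", 0)` for $FG\ell$ and $GF\ell$.

```python
def leq(a, b):
    """Decide a <= b, i.e. validity of a -> b, for two literals on the same l.

    G X^n l speaks about all instants from t+n+1 on, F X^n l about some
    instant from t+n+1 on, and X^n l about the single instant t+n.
    """
    (ka, na), (kb, nb) = a, b
    if a == b:
        return True
    if ka == "G":
        if kb == "G":
            return na <= nb          # a larger shift constrains fewer instants
        if kb == "X":
            return nb >= na + 1      # the instant t+nb must be covered by G
        return True                  # G X^n l implies FG l, GF l and any F X^m l
    if ka == "X":
        return kb == "F" and na >= nb + 1
    if ka == "FG":
        return kb in ("GF", "F")
    if ka == "GF":
        return kb == "F"
    if ka == "F":
        return kb == "F" and na >= nb
    raise ValueError(f"unknown prefix: {ka!r}")
```

For instance, `leq(("G", 0), ("X", 1))` holds whereas `leq(("G", 0), ("X", 0))`
does not, reflecting that $G$ says nothing about the current instant.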

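3.2 Definition of the $\Delta$-lists

Definition 5. Given a tnnf $A$, we define $\Delta_0(A)$ and $\Delta_1(A)$ to
be the lists of literals recursively defined below:

$\Delta_0(\vartheta\ell) = \Delta_1(\vartheta\ell) = \vartheta\ell$

$\Delta_0(\bigwedge_{i=1}^n A_i) = Union_\wedge(\Delta_0(A_1), \ldots, \Delta_0(A_n))$

$\Delta_0(\bigvee_{i=1}^n A_i) = Intersection(\Delta_0(A_1), \ldots, \Delta_0(A_n))$

$\Delta_1(\bigwedge_{i=1}^n A_i) = Intersection(\Delta_1(A_1), \ldots, \Delta_1(A_n))$

$\Delta_1(\bigvee_{i=1}^n A_i) = Union_\vee(\Delta_1(A_1), \ldots, \Delta_1(A_n))$

$\Delta_b(FA) = Add_F(\Delta_b(A))$

$\Delta_b(GA) = Add_G(\Delta_b(A))$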
Fig. 2. Lattice $Lit(\ell) \cup \{\bot, \top\}$ (diagram not reproduced)

The description of the operators involved in the definition above is the fol- 
lowing: 

1. The operators Add add a temporal connective to each element of a list of 
literals and simplify the results to a tnnf according to the rules in Fig. 1. 

2. The two versions of Union arise because of the intended interpretation of 
these sets: 

(a) Elements in $\Delta_0$ are considered as conjunctively connected, so we use
$\mathrm{Union}_{\wedge}$. This way, we obtain minimal implicates.

(b) Elements in $\Delta_1$ are considered as disjunctively connected, so we use
$\mathrm{Union}_{\vee}$. This way, we obtain maximal implicants.

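Remark 6. By conjunctively connected, we mean that two literals in $\Delta_0$ are
substituted by their conjunction if it is either a literal or $\top$ or $\bot$, i.e.
$\vartheta\ell \wedge \vartheta\ell{\uparrow} = \vartheta\ell$, $\vartheta\ell \wedge \overline{\vartheta\ell}{\downarrow} = \bot$, and the pair of literals
$G{\oplus}^{n+1}\ell$ and ${\oplus}^{n+1}\ell$ is simplified to $G{\oplus}^{n}\ell$, for all n.

Similarly, the disjunctive connection in $\Delta_1$ means the application of the
following rules: $\vartheta\ell \vee \vartheta\ell{\downarrow} = \vartheta\ell$, $\vartheta\ell \vee \overline{\vartheta\ell}{\uparrow} = \top$, and the pair of literals
$F{\oplus}^{n+1}\ell$ and ${\oplus}^{n+1}\ell$ is simplified to $F{\oplus}^{n}\ell$ in $\Delta_1$, for all n.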

It is easy to see that, for all $\ell$, we have that $\Delta_b(A) \cap \mathrm{Lit}(\ell)$ contains at most
one literal in the set $\{F{\oplus}^n\ell, FG\ell, GF\ell\}$ and, possibly, several of the type
${\oplus}^n\ell$.

Definition 7. If A is a tnnf, then to $\Delta$-label A means to label each node $\eta$ in
A with the ordered pair $(\Delta_0(\eta), \Delta_1(\eta))$.

Example 8. Consider the formula
$A = (\neg p \vee \neg Gq \vee r \vee G(\neg s \vee \neg q \vee u)) \wedge \neg(\neg p \vee \neg Gq \vee r \vee G(\neg s \vee u))$;
the $\Delta$-labelled tree of A is shown below. For the sake of clarity, the
$\Delta$-labels of the leaves are not written.

$\wedge$ ($p\; Gq\; \neg r\; Fs\; F\neg u$, nil)

$\vee$ (nil, $\neg p\; F\neg q\; r\; G\neg s\; Gu$)    $p$    $Gq$    $\neg r$    $F$ ($Fs\; F\neg u$, nil)

$\neg s$    $\neg q$    $u$

Note that in node 1, literals $F\neg q$ and $G\neg q$ are collapsed into $F\neg q$, because of the
disjunctive connection in $\Delta_1$.

Example 9. Let us study the validity of $A = G(\neg p \to p) \to (\neg Gp \to Gp)$. The
$\Delta$-labelled tree equivalent to $\neg A$ is

$\wedge$ ($\bot$, nil)

$G$ ($Gp$, $Gp$)    $F\neg p$    $F\neg p$

$\vee$ ($p$, $p$)

$p$    $p$

In this case, $\Delta_0(\varepsilon) = \bot$, because of the simplification of $Gp$ and $F\neg p$ due to
the conjunctive nature of the $\Delta_0$-sets. We will see later that $\Delta_0(\varepsilon) = \bot$ implies
that the input formula, that is $\neg A$, is unsatisfiable, therefore A is valid.

3.3 Information in the $\Delta$-lists

As indicated above, the purpose of defining $\Delta_0$ and $\Delta_1$ is to collect implicants
and implicates of A, as shown in the following theorem.

Theorem 10. Let A be a tnnf,

1. If $\vartheta\ell \in \Delta_0(A)$, then $\models A \to \vartheta\ell$.

2. If $\vartheta\ell \in \Delta_1(A)$, then $\models \vartheta\ell \to A$.

The theorem above will be used in the following equivalent form:

1. If $\vartheta\ell \in \Delta_0(A)$, then $A \equiv A \wedge \vartheta\ell$.

2. If $\vartheta\ell \in \Delta_1(A)$, then $A \equiv A \vee \vartheta\ell$.

As a literal is satisfiable, by Theorem 10 item 2, we have the following result:

Corollary 11. If $\Delta_1(A) \neq \mathrm{nil}$, then A is satisfiable.
3.4 Strong Meaning-Preserving Reductions

A lot of information can be extracted from the $\Delta$-lists as corollaries of
Theorem 10. The first result is a structural one, for it says that either one of the
$\Delta$-lists is empty, or both are equal and singletons.

Corollary 12. If A is not a literal and $\Delta_1(A) \neq \mathrm{nil} \neq \Delta_0(A)$, then there
exists $\vartheta\ell$ such that $\Delta_1(A) = \Delta_0(A) = \vartheta\ell$. Such tnnf A is said to be $\vartheta\ell$-simple.

The corollary below states conditions on the $\Delta$-lists which allow us to determine
the validity or unsatisfiability of the formula we are studying.

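Corollary 13. Let A be a tnnf, then

1. (a) If $\Delta_0(A) = \bot$, then $A \equiv \bot$.

(b) If $A = \bigwedge_{i=1}^{n} A_i$ in which a conjunct $A_{i_0}$ is a clause such that
$\overline{\Delta_1(A_{i_0})} \subseteq \Delta_0(A){\uparrow}$, then $A \equiv \bot$.

(c) If $A = \bigwedge_{i=1}^{n} A_i$ in which a conjunct $A_{i_0}$ is a G-clause GB such that
$\overline{\mathrm{Add}_{\oplus}(\Delta_1(B))} \subseteq \Delta_0(A){\uparrow}$, then $A \equiv \bot$.

2. (a) If $\Delta_1(A) = \top$, then $A \equiv \top$.

(b) If $A = \bigvee_{i=1}^{n} A_i$ in which a disjunct $A_{i_0}$ is a cube such that
$\overline{\Delta_0(A_{i_0})} \subseteq \Delta_1(A){\downarrow}$, then $A \equiv \top$.

(c) If $A = \bigvee_{i=1}^{n} A_i$ in which a disjunct $A_{i_0}$ is an F-cube FB such that
$\overline{\mathrm{Add}_{\oplus}(\Delta_0(B))} \subseteq \Delta_1(A){\downarrow}$, then $A \equiv \top$.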

The following definition gives a name to those formulas which have been
simplified by using the information in the $\Delta$-lists.

Definition 14. Let A be a tnnf, then it is said that A is:

1. finalizable if either $A = \top$, or $A = \bot$, or $\Delta_1(A) \neq \mathrm{nil}$.

2. A tnnf verifying either (a) or (b) or (c) of item 1 in Corollary 13 is said to
be $\Delta_0$-conclusive.

3. A tnnf verifying either (a) or (b) or (c) of item 2 in Corollary 13 is said to
be $\Delta_1$-conclusive.

4. A tnnf A is said to be $\Delta$-restricted if it has no subtree which is either
$\Delta_0$-conclusive, or $\Delta_1$-conclusive, or $\vartheta\ell$-simple.

5. To $\Delta$-restrict a tnnf A means to substitute each $\Delta_1$-conclusive formula by $\top$,
each $\Delta_0$-conclusive formula by $\bot$, and each $\vartheta\ell$-simple formula by $\vartheta\ell$, and then
eliminate the constants $\top$ and $\bot$ by applying the 0-1 laws.

Note that $\Delta$-restricting is a meaning-preserving transformation.



Example 15. Given the transitivity axiom $A = FFp \to Fp$; the tnnf equivalent
to $\neg A$ is $F{\oplus}p \wedge G\neg p$; since $\Delta_0(F{\oplus}p \wedge G\neg p) = \bot$, we have that $\neg A$ is
$\Delta_0$-conclusive, therefore $\neg A$ is unsatisfiable and A is valid.



Example 16. Given the formula A = ®p A®Fp AG{p Fp), its A-labelled tree 
is 
$\wedge$ (${\oplus}p\; F{\oplus}p$, nil)




(Bp F(Bp G (nil, GpGFp) 
V (nil, pFp ) 




p Fp 



This tree is Z\o-conclusive, since Add0(Z\i(31)) = (BpF(Bp C Z\o(e)T. In fact, 
what we have in this example is Add0(Z\i(31)) = ^o(^) 

3.5 Weak Meaning-Preserving Reductions 

The aim of this section is to give more general conditions under which the
information in the $\Delta$-lists can still be used when the strong reductions do
not apply. Specifically, a strong reduction uses the information in the $\Delta$-lists
in a strong sense, that is, to substitute a whole subformula by either $\top$, or $\bot$,
or a literal. As in the propositional case, sometimes this is not possible and we
can only use the information in a weak sense, that is, to decrease the size of the
formula by eliminating literals depending on the elements of the $\Delta$-lists.

The following notation is used in the statement of some results hereafter: 

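— If S is a set of literals in a tnnf A, then $S^0$ denotes the set of all the
occurrences of literals in S of temporal order 0 in A.

— $\mathrm{Lit}(\ell, n) = \{\eta \mid \eta = \vartheta\ell \text{ and } |\vartheta| + \mathrm{ord}_A(\eta) > n + 1\}$

— $\mathrm{Lit}(\overline{\ell}, n) = \{\eta \mid \eta = \vartheta\overline{\ell} \text{ and } |\vartheta| + \mathrm{ord}_A(\eta) > n + 1\}$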

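Theorem 17. Let A be a tnnf and $\vartheta\ell$ a literal in A:

1. If $\vartheta\ell \in \Delta_0(A)$, then $A \equiv \vartheta\ell \wedge A[(\vartheta\ell{\uparrow})^0/\top, (\overline{\vartheta\ell}{\downarrow})^0/\bot]$

2. If $\vartheta\ell \in \Delta_1(A)$, then $A \equiv \vartheta\ell \vee A[(\vartheta\ell{\downarrow})^0/\bot, (\overline{\vartheta\ell}{\uparrow})^0/\top]$

This theorem cannot be improved for an arbitrary literal $\vartheta\ell$, although, for
some particular cases, it is possible to get more literals reduced, as shown by the
following theorem, which generalises the result in Theorem 17 by dropping the
restriction of order 0 for all the literals in the upward/downward closures.

Theorem 18. Let A be a tnnf,

1. If $\vartheta\ell \in \Delta_0(A)$ with $\vartheta\ell \in \{FG\ell, GF\ell\} \cup \{G{\oplus}^n\ell \mid n \in \mathbb{N}\}$, then

$A \equiv \vartheta\ell \wedge A[\vartheta\ell{\uparrow}/\top, \overline{\vartheta\ell}{\downarrow}/\bot]$

2. If $\vartheta\ell \in \Delta_1(A)$ with $\vartheta\ell \in \{FG\ell, GF\ell\} \cup \{F{\oplus}^n\ell \mid n \in \mathbb{N}\}$, then

$A \equiv \vartheta\ell \vee A[\vartheta\ell{\downarrow}/\bot, \overline{\vartheta\ell}{\uparrow}/\top]$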
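Finally, the theorem below states a number of additional reductions that can
be applied when $\vartheta\ell$ equals either $G{\oplus}^n\ell$ or $F{\oplus}^n\ell$.

Theorem 19. Let A be a tnnf and $\vartheta\ell$ a literal in A:

1. If $G{\oplus}^n\ell \in \Delta_0(A)$, then $A \equiv G{\oplus}^n\ell \wedge A[\mathrm{Lit}(\ell, n)/\top, \mathrm{Lit}(\overline{\ell}, n)/\bot]$

2. If $F{\oplus}^n\ell \in \Delta_1(A)$, then $A \equiv F{\oplus}^n\ell \vee A[\mathrm{Lit}(\ell, n)/\bot, \mathrm{Lit}(\overline{\ell}, n)/\top]$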

4 Adding Information to the Tree: $\widehat{\Delta}$-sets

In the previous sections, the information in the $\Delta$-lists has been used locally, that
is, the information in $\Delta_b(\eta)$ has been used to reduce $\eta$. The purpose of defining
a new structure, the $\widehat{\Delta}$-sets, is to allow the globalisation of the information, in
that the information in $\Delta_b(\eta)$ can be refined by the information in its ancestors.

Given a $\Delta$-restricted tnnf A, we define the sets $\widehat{\Delta}_0(A)$ and $\widehat{\Delta}_1(A)$, whose
elements are pairs $(\alpha, \eta)$ where $\alpha$ is a reduced $\Delta$-list (to be defined below)
associated to a subformula B of A, and $\eta$ is the address of B in A. These sets allow
us to transform the formula A into an equisatisfiable and smaller-sized one, as seen
in Section 4.1.

The following result uses those cases in Theorems 17, 18 and 19 which allow
us to delete a whole subformula. The remaining cases only allow literals to be
deleted; these literals will be called reducible.

Theorem 20. Let A be a tnnf, B a subformula of A, and rj the address in the 
tree of A of a subformula of B: 

1. (a) If £}£ is any literal satisfying D£ G Z\o(??)T n (Z\i(i?) U Z\o(i?)) and 

ordBiv) = 0; then A = A[r]/±]. 

(b) If Di G {FG£, GF£} U {G©"£ | n G N} and satisfies and M G Z\o(?y)T n 
(Ai{B) LI Ao{B)) , then A = A[r]/±]. 

(c) If£}£ G Z\o(?y)T, and G Ai{B)U Ao{B), and |r?| +ordB(r]) >n+\, 
then A = A[r]/ A]. 

2. (a) If M is any literal satisfying §£ G Ai{r])i n {Aq{B) U Z\i(i?)) and 

ordBiv) = 0; then A = A[r]/T]. 

(b) If £}£ G {FG£, GF£} U {G©"£ | n G N} and satisfies and §£ G n 

(Ao{B) L> Ai{B)) , then A = A[ri/T]. 

(c) If Di G Ai{rii)[, and G©"!* G Aq{B) U and +ordB{ri) > n + 1, 

then A = A[r]/T] 

This theorem can be seen as a generalisation of Corollary 13, in which a sub- 
formula B can be substituted by a constant even when that subformula is not 
equivalent to that constant. 

The subformula at address $\eta$ in A is said to be 0-conclusive in A (resp.
1-conclusive in A) if it verifies some of the conditions in item 1 (resp. item 2)
above.
Definition 21. Given a tnnf A and an address $\eta$, the reduced $\Delta$-lists for A,
$\Delta_b^A(\eta)$ for $b \in \{0, 1\}$, are defined below,

1. If $\eta$ is 0-conclusive in A, then $\Delta_0^A(\eta) = \bot$.

2. If $\eta$ is 1-conclusive in A, then $\Delta_1^A(\eta) = \top$.

3. Otherwise, $\Delta_b^A(\eta)$ is the list $\Delta_b(\eta)$ in which the reducible literals have been
deleted.

We define the sets $\widehat{\Delta}_b(A)$ as follows:

$\widehat{\Delta}_b(A) = \{(\Delta_b^A(\eta), \eta) \mid \eta \text{ is a non-leaf address in } T_A \text{ with } \Delta_b^A(\eta) \neq \mathrm{nil}\}$

If A is a tnnf, to label A means to $\Delta$-label A and to associate to the root of A
the ordered pair $(\widehat{\Delta}_0(A), \widehat{\Delta}_1(A))$.

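Example 22. From Example 8 we had the following tree

$\wedge$ ($p\; Gq\; \neg r\; Fs\; F\neg u$, nil)

$\vee$ (nil, $\neg p\; F\neg q\; r\; G\neg s\; Gu$)    $p$    $Gq$    $\neg r$    $F$ ($Fs\; F\neg u$, nil)

$\neg s$    $\neg q$    $u$

Note that literals $\neg p$, $F\neg q$ and $r$ in $\Delta_1$ of node 1 are reducible in A because of the
occurrence of their duals in $\Delta_0$ of the root. Similarly $G\neg q$ is also reducible in node
14, and $\neg q$ is reducible in 141. Therefore, the calculation of the $\widehat{\Delta}$-sets leads to

$\widehat{\Delta}_0(A) = \{(p\; Gq\; \neg r\; Fs\; F\neg u, \varepsilon), (Fs\; F\neg u, 5), (s\; \neg u, 51)\}$

$\widehat{\Delta}_1(A) = \{(G\neg s\; Gu, 1), (G\neg s\; Gu, 14), (\neg s\; u, 141)\}$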

4.1 Satisfiability-Preserving Results 

In this section we study the information which can be extracted from the $\widehat{\Delta}$-sets.

Definition 23. Let A be a tnnf, then it is said that A is restricted if it is
$\Delta$-restricted and satisfies the following:

— There are no elements $(\bot, \eta)$ in $\widehat{\Delta}_0(A)$.

— There are no elements $(\top, \eta)$ in $\widehat{\Delta}_1(A)$.

Remark 24. A restricted and equivalent tnnf can be obtained by using the 0-1
laws in conjunction with the elimination of conclusive subformulas in A,
according to Theorem 20.

The following results will allow us, by using the information in the $\widehat{\Delta}$-sets, to
substitute a tnnf A by an equisatisfiable and smaller-sized A'.
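Complete Reduction

This section is named after Theorem 26 because its application to a literal
$G{\oplus}^n\ell$ gives an equisatisfiable formula whose only literals in $\ell$ are of the
form ${\oplus}^k\ell$.

Definition 25. A tnnf A is said to be $G{\oplus}^n\ell$-completely reducible if $G{\oplus}^n\ell \in \alpha$
for $(\alpha, \varepsilon) \in \widehat{\Delta}_0(A)$.

Theorem 26. If A is a $G{\oplus}^n\ell$-completely reducible tnnf, then A is satisfiable if
and only if the following formula is satisfiable:

$B[G{\oplus}^k\ell / {\oplus}^{k+1}\ell \wedge \dots \wedge {\oplus}^n\ell,\; F{\oplus}^k\overline{\ell} / {\oplus}^{k+1}\overline{\ell} \vee \dots \vee {\oplus}^n\overline{\ell}]$

where $B = A[\mathrm{Lit}(\ell, n) \cup G{\oplus}^n\ell{\uparrow}/\top,\; \mathrm{Lit}(\overline{\ell}, n) \cup F{\oplus}^n\overline{\ell}{\downarrow}/\bot]$.

Furthermore, if h is a model of B in t, then the interpretation h' such that
$h'(q) = h(q)$ if $q \neq p$ and $h'(p) = h(p) \cup [t + n + 2, \infty)$ is a model of A in t.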

Example 27. Given the density axiom $A = Fp \to FFp$; the formula $\neg A$ is
equivalent to the tnnf $Fp \wedge G{\oplus}\neg p$.

We have that $\Delta_0(Fp \wedge G{\oplus}\neg p) = Fp\; G{\oplus}\neg p$. Note that, as the conjunction of $Fp$
and $G{\oplus}\neg p$ is not a literal, no simplification can be applied. In addition, its $\widehat{\Delta}_0$-set
is $\{(Fp\; G{\oplus}\neg p, \varepsilon)\}$, thus $\neg A$ is completely reducible.

Now applying Theorem 26, we get that $\neg A$ is satisfiable if and only if ${\oplus}p$ is
satisfiable. Therefore $\neg A$ is satisfiable, a model being $h(\neg p) = [2, \infty)$, $h(p) = \{1\}$.
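The model given in this example can be checked mechanically. The following Python sketch is only an illustration, not the paper's procedure: it assumes time points 0, 1, 2, ... with a strict reading of F and G, represents formulas as nested tuples, and uses the hypothetical name `holds`. Because each atom is true at finitely many points, every formula has a constant truth value after the last such point, so one representative future point beyond it is enough.

```python
def holds(formula, t, true_points):
    """Truth of `formula` at time t; atom p is true exactly at true_points[p]."""
    op = formula[0]
    if op == "atom":
        return t in true_points.get(formula[1], set())
    if op == "not":
        return not holds(formula[1], t, true_points)
    if op == "and":
        return holds(formula[1], t, true_points) and holds(formula[2], t, true_points)
    if op == "or":
        return holds(formula[1], t, true_points) or holds(formula[2], t, true_points)
    if op == "implies":
        return (not holds(formula[1], t, true_points)) or holds(formula[2], t, true_points)
    if op == "next":
        return holds(formula[1], t + 1, true_points)
    # Strict future of t, cut one step after the last point where an atom is true:
    # beyond that point truth values no longer change.
    last = max((max(s) for s in true_points.values() if s), default=-1)
    future = range(t + 1, max(t, last) + 2)
    if op == "F":
        return any(holds(formula[1], u, true_points) for u in future)
    if op == "G":
        return all(holds(formula[1], u, true_points) for u in future)
    raise ValueError(f"unknown connective: {op}")


p = ("atom", "p")
density = ("implies", ("F", p), ("F", ("F", p)))
negated_density_tnnf = ("and", ("F", p), ("G", ("next", ("not", p))))
model = {"p": {1}}  # p holds only at time 1, so not-p holds on [2, infinity)

print(holds(negated_density_tnnf, 0, model))  # the tnnf of the negated axiom
print(holds(density, 0, model))               # the density axiom itself
```

Under these assumptions the tnnf of the negated axiom is satisfied at time 0 and the density axiom is falsified, which is what the example claims.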

Example 28. Given the formula $A = (Gp \wedge Fq) \to F(p \wedge q)$, we have $\neg A =$

$Gp \wedge Fq \wedge G(\neg p \vee \neg q)$; its $\Delta$-restricted form is

A {GpFq, nil ) 




Gp Fq G (nil, GpFq) 
V (nil,M) 




p q 



and its Z\-sets are: 

^o(^) = {{GpFq,s)} MA)={{GpFq,3),(pq,3l)} 

This formula is completely reducible; by an application of Theorem 26, the
leaf in node 1 is deleted, and node 3 is substituted by $G\neg q$.

The resulting formula is $Fq \wedge G\neg q$, which is 0-conclusive and, therefore,
unsatisfiable.




Implicates and Reduction Techniques for Temporal Logics 321 



The Pure Literal Rule 

The result introduced here is an extension of the well-known pure literal rule
for Classical Propositional Logic. Existing results in the literature allow a
straightforward extension of the concept of pure literal. Our definition makes
use of the $\widehat{\Delta}$-sets, which allow us to focus only on those literals which are
essential parts of the formula; this is because reducible literals are not included
in the $\widehat{\Delta}$-sets.
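For comparison, the classical rule that this section extends can be sketched in a few lines of Python. This is the propositional pure literal rule on clause sets, not the temporal version defined below; the function names and the DIMACS-style encoding of literals as nonzero integers are assumptions made for the illustration.

```python
def pure_literals(clauses):
    """Literals (nonzero ints, negation = sign change) whose complement never occurs."""
    occurring = {lit for clause in clauses for lit in clause}
    return {lit for lit in occurring if -lit not in occurring}


def eliminate_pure(clauses):
    """Repeatedly delete every clause containing a pure literal.

    Returns the remaining clauses and the literals that were set to true.
    The result is equisatisfiable with the input, not equivalent to it.
    """
    clauses = [frozenset(c) for c in clauses]
    assigned = set()
    while True:
        pure = pure_literals(clauses)
        if not pure:
            return clauses, assigned
        assigned |= pure
        clauses = [c for c in clauses if not (c & pure)]


remaining, assigned = eliminate_pure([[1, 2], [-1, 3], [-2, -3], [4, -1]])
print(remaining, assigned)
```

As in the temporal theorems of this section, a model of the reduced formula is extended to a model of the original one by making the pure literals true.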

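Definition 29. Let A be a tnnf.

1. A classical literal $\ell$ is said to be $\widehat{\Delta}$-pure in A if a literal $\vartheta\ell$ occurs in
$\widehat{\Delta}_0(A) \cup \widehat{\Delta}_1(A)$ and no literal on $\overline{\ell}$ occurs in $\widehat{\Delta}_0(A) \cup \widehat{\Delta}_1(A)$.

2. A classical literal $\ell$ is said to be $\widehat{\Delta}$-k-pure in A if ${\oplus}^k\ell$ occurs in an
$(\alpha, \eta) \in \widehat{\Delta}_0(A) \cup \widehat{\Delta}_1(A)$ with $\mathrm{ord}_A(\eta) = 0$, ${\oplus}^k\overline{\ell}$ does not occur in any
$(\alpha, \eta) \in \widehat{\Delta}_0(A) \cup \widehat{\Delta}_1(A)$ with $\mathrm{ord}_A(\eta) = 0$, and for any other literal $\vartheta\ell$ or
$\vartheta\overline{\ell}$ occurring in some element $(\alpha, \eta) \in \widehat{\Delta}_0(A) \cup \widehat{\Delta}_1(A)$, we have
$|\vartheta| + \mathrm{ord}_A(\eta) > k$.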

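Theorem 30. Let A be a tnnf, $\ell$ a $\widehat{\Delta}$-pure literal in A, and B the formula
obtained from A by the following substitutions:

1. If $(\alpha, \eta) \in \widehat{\Delta}_0(A)$ with $\vartheta\ell \in \alpha$, then $\eta$ is substituted by

$$
\begin{cases}
\eta[\mathrm{Lit}(\ell, n) \cup G{\oplus}^n\ell{\uparrow}/\top,\; \mathrm{Lit}(\overline{\ell}, n) \cup F{\oplus}^n\overline{\ell}{\downarrow}/\bot] & \text{if } \vartheta\ell = G{\oplus}^n\ell \\
\eta[\vartheta\ell{\uparrow}/\top,\; \overline{\vartheta\ell}{\downarrow}/\bot] & \text{if } \vartheta\ell \in \{GF\ell, FG\ell\} \\
\eta[(\vartheta\ell{\uparrow})^0/\top,\; (\overline{\vartheta\ell}{\downarrow})^0/\bot] & \text{otherwise}
\end{cases}
$$

2. If $(\alpha, \eta) \in \widehat{\Delta}_1(A)$ with $\vartheta\ell \in \alpha$, then $\eta$ is substituted by $\top$.

Then, A is satisfiable if and only if B is satisfiable. Furthermore, if h is a model
of B in t, then the interpretation h' such that $h'(\ell') = h(\ell')$ if $\ell' \neq \ell$ and
$h'(\ell) = [t, \infty)$ is a model of A in t.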

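Theorem 31. Let A be a tnnf, $\ell$ a $\widehat{\Delta}$-k-pure literal in A, and B the formula
obtained from A by the following substitutions:

1. If $(\alpha, \eta) \in \widehat{\Delta}_0(A)$ with ${\oplus}^k\ell \in \alpha$ and $\mathrm{ord}_A(\eta) = 0$, then $\eta$ is substituted by
$\eta[({\oplus}^k\ell{\uparrow})^0/\top,\; ({\oplus}^k\overline{\ell}{\downarrow})^0/\bot]$

2. If $(\alpha, \eta) \in \widehat{\Delta}_1(A)$ with ${\oplus}^k\ell \in \alpha$, then $\eta$ is substituted by $\top$.

Then, A is satisfiable if and only if B is satisfiable. Furthermore, if h is a model
of B in t, then the interpretation h' such that $h'(\ell') = h(\ell')$ if $\ell' \neq \ell$ and
$h'(\ell) = h(\ell) \cup \{t + k\}$ is a model of A in t.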

Example 32. Continuing with the formula in Example 22, we had

$\widehat{\Delta}_0(A) = \{(p\; Gq\; \neg r, \varepsilon), (Fs\; F\neg u, 5), (s\; \neg u, 51)\}$

$\widehat{\Delta}_1(A) = \{(G\neg s\; Gu, 1), (G\neg s\; Gu, 14), (\neg s\; u, 141)\}$

therefore

1. It is completely reducible: $Gq \in \alpha$ with $(\alpha, \varepsilon) \in \widehat{\Delta}_0(A)$.

2. Literals $p$ and $\neg r$ are 0-pure.

When applying the corresponding substitutions we get




G F 



V A 




s u s u 



This formula cannot be reduced any longer. By applying a branching rule^ 
we obtain 




G F 



V A 




s u 



It is easy to check that node 21 is $\Delta_0$-conclusive; by substituting this node
by $\bot$ we get $\bot$ as a final result. Therefore the formula is unsatisfiable.

5 Conclusions and Future Work 

We have introduced techniques for defining and manipulating lists of unitary im- 
plicants/implicates which can improve the performance of a given prover for tem- 
poral propositional logics by decreasing the size of the formulas to be branched. 
These strategies are interesting because they can be used in any existing theorem
prover, especially in non-clausal ones.

As future work, the information in the $\Delta$-lists can be increased by refining
the process of generation of temporal implicants/implicates. In addition, current 
work on G-clauses and F-cubes appears to be a new source of reduction results. 

^ Every prover for linear-time temporal logic has such rules, in the example we use 
just one of those in the literature. 
Abstract. One of the main characteristics of reasoning in knowledge
based systems is its high computational complexity. Anytime deduction 
and anytime compilation are two attractive approaches that have been 
proposed for addressing such a difficulty. The first one offers a com- 
promise between the time complexity needed to compute approximate 
answers and the quality of these answers. The second one proposes a 
trade-off between the space complexity of the compiled theory and the 
number of possible answers it can efficiently process. The purpose of our 
study is to define a logic which handles these two approaches by incorpo- 
rating several major features. First, the logic is semantically founded on 
the notion of resource which captures both the accuracy and the cost of 
approximation. Second, a stepwise procedure is included for improving 
approximate answers. Third, both sound approximations and complete 
ones are covered. Fourth and finally, the reasoning task may be done 
off-line and compiled theories can be used for answering many queries. 



1 Introduction 

During these past decades, the problem of reasoning about commonsense knowl- 
edge has received a great deal of attention in the artificial intelligence commu- 
nity. A widely accepted framework for studying this issue is the knowledge based 
system approach [14]. Knowledge is described in some logical formalism, called 
the representation language, and stored in a knowledge base. This component 
is combined with a reasoning mechanism, which is used to determine whether 
a given sentence, assumed to capture the query, is entailed from the knowledge 
base. However, it is well known that deduction is computationally very
demanding. In particular, if the knowledge base and the query
are represented in propositional logic, then checking whether the query is
entailed by the knowledge base is a coNP-complete problem, that is, a
problem which probably requires exponential time to be solved.

Anytime computation is a technique which is used in many areas of artificial 
intelligence to deal with the computational intractability of problems (see [24] 
for a recent survey) . This paradigm extends the traditional notion of reasoning 
mechanism by allowing it to return many possible approximate answers to any 
given query. In the setting of propositional logic, two attractive approaches have 
recently appeared in the literature: anytime deduction and anytime compilation. 
The goal of the first approach is to define a family of entailment relations
that “approximate” classical entailment, by relaxing soundness or completeness
of reasoning. The knowledge based system can provide partial solutions even if
stopped prematurely; the accuracy of the solution improves with the time used
in computing the solution. Hence, anytime deduction offers a compromise
between the time complexity needed to compute answers by means of approximate
entailment relations and the quality of these answers. Following this idea, Dalal
in [5] presents a general technique for approximating deduction problems. The
starting point of his framework is an entailment relation defined using Boolean
constraint propagation. Based on this relation, the author defines a family of
entailment relations which extend it by allowing chaining on sentences of size k.
Each relation in the family is sound but incomplete with respect to classical
entailment. A different method has been proposed by Cadoli
and Schaerf in [2,17]. Their framework includes a parameter S, a set of atomic
propositions, which captures the quality of approximation. Based on this
parameter, the authors define two families of entailment relations, named $\models_S^1$ and $\models_S^3$,
which are respectively unsound but complete and sound but incomplete with
respect to classical entailment. Several extensions of this framework have been
proposed in the domains of non-monotonic logics [3], diagnostic reasoning [22],
and reasoning in the presence of inconsistency [9,10].

The second approach is concerned with preprocessing a knowledge base into
an appropriate data structure which is used for query answering. The goal here 
is to invest computational resources in the preprocessing effort which will later 
substantially speed up query answering, in the expectation that the cost of com- 
pilation will be amortized over many queries. Compilation is called “exact” if 
the data structure is logically equivalent to the initial knowledge base, thus 
guaranteeing answers to all possible queries (see e.g. [16,23]). However, in exact 
compilation it has been observed that the compiled knowledge base often occu- 
pies space exponential in the size of the initial source [19]. This undesirable effect 
has led several researchers to explore the possibility of compiling the knowledge
base into a family of data structures that “approximate” the initial knowledge 
base, giving up soundness or completeness of reasoning. The system attempts 
to compile a knowledge base exactly until a given resource limit is reached, and 
may answer queries before the completion of compilation. One can view anytime 
compilation as a technique which offers a trade-off between the space complexity 
of the compiled knowledge base and the number of queries that can be efficiently 
processed by this data structure. For example, several authors present anytime 
methods based on prime implicates generation which are sound but incomplete 
with respect to exact compilation [7,13,15]. Dually, Schrag in [18] proposes a 
prime implicants generation algorithm which is unsound but complete with re- 
spect to exact compilation. An analogous strategy has been proposed by Selman 
and Kautz in the context of “Horn approximation” for computing all the greatest 
lower bounds (GLB) of a clausal knowledge base [20]. 

The purpose of the paper is to introduce a unifying, logic oriented framework
which captures the main ideas of these two approaches. Our investigation
generalizes and expands in several directions previous results by Cadoli and Schaerf
in [2,17]. The framework is based on a multi-modal logic equipped with a
well-founded semantics and a sound and complete axiomatization. Moreover, the
framework integrates the following major features:

— The logic is semantically founded on the notion of resource which reflects 
both the accuracy and the computational cost of the approximations. 

— The framework enables incremental reasoning: the quality of approximations 
is a nondecreasing function of the resources that have been spent. Hence, 
approximate answers can be improved and may converge to the right answer. 

— The framework covers dual reasoning: both sound but incomplete and com- 
plete but unsound answers are returned at any step; they respectively cor- 
respond to the lower and upper bounds of the range of possible conclusions 
that approximate the right answer. 

— The framework allows off-line reasoning: the knowledge base can be
compiled and the resulting data structure may be used for efficiently processing
a large set of queries.

The formalism we propose is flexible enough to be applied to several anytime
reasoning methods. In this study, we concentrate on the specifications of anytime
deducers and anytime compilers which extend the traditional notion of
knowledge based systems. Anytime deducers incorporate the first three properties of
our framework; they approximate the reasoning task by iteratively increasing
their inference capabilities. Anytime compilers also exploit off-line reasoning by
iteratively computing better and better approximations of their knowledge base.

The rest of the paper is organized as follows. Section 2 formally defines the
syntax, the semantics, and a sound and complete axiomatization for the logic.
Sections 3 and 4 are devoted to the formal specifications of anytime deducers and
anytime compilers. Finally, Section 5 suggests some topics for future research.
The proof of soundness and completeness of the logic is given in Appendix A.

2 The Logic 

In this section, we present a propositional logic, named ARL, for anytime
reasoning. We insist on the fact that the logic is being used here as a specification
tool to describe an anytime reasoner rather than as a calculus to be used by
one. We begin by defining the syntax of the logic, then examine its semantics
in detail, and finally present a sound and complete axiomatization for ARL.



2.1 Syntax 

In this study, we consider a propositional language constructed from a finite set 
of atomic propositions (atoms for short) P. In order to formalize the reasoning 
capabilities of a knowledge based system, we model the notion of deductive 
inference as an exploration in a space of possibilities. In a propositional setting, 
this space is defined by the collection of all the interpretations defined from the 
atoms of P. Following [17], the notion of resource is captured by a parameter S,
a subset of P, which corresponds to a limited exploration in this space.

The main contribution of this logic lies in two families of modalities $\Box_S$
and $\Diamond_S$, defined for each subset S of P. The operator $\Box_S$ is to capture sound but
incomplete inference and $\Diamond_S$ to capture complete but unsound inference. Based
on these considerations, the language of ARL is defined as the smallest set of
sentences built from the following rules: if p is an atom of P then p is a sentence;
if $\alpha$ is a sentence, then $\neg\alpha$ is a sentence; if $\alpha$ and $\beta$ are sentences then $\alpha \wedge \beta$
and $\alpha \vee \beta$ are sentences; and finally, if $\alpha$ is a sentence that does not contain any
occurrence of the modalities $\Box_S$ and $\Diamond_S$, then $\Box_S\alpha$ and $\Diamond_S\alpha$ are sentences. We
remark that the syntax does not allow nested modal operators. A sentence such
as $\Box_S\alpha$ is read “the system necessarily infers $\alpha$ given the resources S”; dually
$\Diamond_S\alpha$ is read “the system possibly infers $\alpha$ given the resources S”.

Other connectives $\supset$ and $\equiv$ are defined in terms of $\neg$, $\wedge$ and $\vee$; that is, $\alpha \supset \beta$
is an abbreviation of $\neg\alpha \vee \beta$ and $\alpha \equiv \beta$ is an abbreviation of $(\alpha \supset \beta) \wedge (\beta \supset \alpha)$.
A declaration is a sentence without any occurrence of the modalities $\Box_S$ and $\Diamond_S$,
and a knowledge base A is a finite conjunction of declarations. When there is no
risk of confusion, we shall model knowledge bases as sets of declarations.

2.2 Semantics 

The basic building block of the semantics is a domain T of truth values which
determines the interpretation of sentences and the properties of logical
consequence. In the context of limited reasoning, the four-valued semantics first
proposed by Belnap [1] and Dunn [8], and notably studied in [12], meets our
needs. It is a simple modification of classical interpretation in which sentences
take as truth values subsets of {0, 1}, instead of simply either 0 or 1 alone. So, in
the logic ARL, sentences can be valued to be true, false, both, or neither.

Based on this structure, we define a valuation as a total function v from P to
T. The space of valuations generated from P is denoted $V_P$. A possible world is
a valuation which maps every atom p of P into {1} or {0}. The space of possible
worlds generated from P is denoted $W_P$. We say that a valuation v is more
specific than v' and write $v \subseteq v'$ if, for any atom $p \in P$, $v(p) \subseteq v'(p)$ holds.

The concept of approximation is semantically represented by an equivalence
relation between valuations. Given a resource parameter S, we say that two
valuations v and v' are S-equivalent and write $v \sim_S v'$ if and only if for every atom
$p \in P$, if $p \in S$ then $v(p) = v'(p)$. It is easy to prove that $\sim_S$ is indeed a reflexive,
symmetric and transitive relation. Intuitively, a relation of S-equivalence induces
a partition of the set $V_P$ into equivalence classes whose granularity captures the
accuracy of approximation. When the resource parameter increases, the partition
becomes “finer” and the approximation more precise. The “coarsest” partition
is obtained when S is the empty set; in this case, $\sim_S$ is the total relation over
$V_P$. Conversely, the “finest” partition is given when S is the set P; in this case
$\sim_S$ is the identity relation over $V_P$.
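The partition induced by S-equivalence can be enumerated directly for small P. The Python sketch below is an illustration only; the names `VALUES`, `valuations`, `s_equivalent` and `partition` are hypothetical, and truth values are encoded as the four subsets of {0, 1}.

```python
from itertools import product

# The four truth values: neither, false, true, both.
VALUES = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]


def valuations(atoms):
    """All total functions from the atoms to the four truth values."""
    atoms = sorted(atoms)
    return [dict(zip(atoms, vals)) for vals in product(VALUES, repeat=len(atoms))]


def s_equivalent(v, w, resources):
    """Two valuations are S-equivalent when they agree on every atom of S."""
    return all(v[p] == w[p] for p in resources)


def partition(atoms, resources):
    """Group the valuations into S-equivalence classes."""
    classes = {}
    for v in valuations(atoms):
        key = tuple(v[p] for p in sorted(resources))
        classes.setdefault(key, []).append(v)
    return list(classes.values())


for resources in (set(), {"p"}, {"p", "q"}):
    blocks = partition({"p", "q"}, resources)
    print(sorted(resources), len(blocks), {len(b) for b in blocks})
```

With P = {p, q} there are 16 valuations: one class of 16 for the empty S, four classes of four for S = {p}, and 16 singleton classes for S = P, matching the coarsest and finest partitions described above.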

Figure 1 illustrates a space of valuations and a relation of S-equivalence
defined for P = {p, q} and S = {p}. The nodes and the edges represent the
valuations and the induced inclusion relation defined from T. The sets of nodes
connected by bold edges represent the equivalence classes.

Fig. 1. A space of valuations and a relation of S-equivalence



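With these notions in hand, we can now define the semantics of our logic.
An interpretation of ARL consists of a truth support relation $\models_1$ and a falsity
support relation $\models_0$ inductively defined by the following conditions:

$$
\begin{aligned}
& v \models_1 p \text{ iff } 1 \in v(p), \qquad v \models_0 p \text{ iff } 0 \in v(p), && (1) \\
& v \models_1 \neg\alpha \text{ iff } v \models_0 \alpha, \qquad v \models_0 \neg\alpha \text{ iff } v \models_1 \alpha, && (2) \\
& v \models_1 \alpha \wedge \beta \text{ iff } v \models_1 \alpha \text{ and } v \models_1 \beta, \qquad v \models_0 \alpha \wedge \beta \text{ iff } v \models_0 \alpha \text{ or } v \models_0 \beta, && (3) \\
& v \models_1 \alpha \vee \beta \text{ iff } v \models_1 \alpha \text{ or } v \models_1 \beta, \qquad v \models_0 \alpha \vee \beta \text{ iff } v \models_0 \alpha \text{ and } v \models_0 \beta, && (4) \\
& v \models_1 \Box_S\alpha \text{ iff } \forall v' \in V_P, \text{ if } v \sim_S v' \text{ then } v' \models_1 \alpha, \qquad v \models_0 \Box_S\alpha \text{ iff } v \not\models_1 \Box_S\alpha, && (5) \\
& v \models_1 \Diamond_S\alpha \text{ iff } \exists v' \in V_P \text{ such that } v \sim_S v' \text{ and } v' \models_1 \alpha, \qquad v \models_0 \Diamond_S\alpha \text{ iff } v \not\models_1 \Diamond_S\alpha. && (6)
\end{aligned}
$$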



A Logic for Anytime Deduction and Anytime Compilation 329 



A sentence $\alpha$ is called satisfiable if and only if there exists a possible world
$w \in W_P$ such that $w \models_1 \alpha$. We say that a sentence $\alpha$ is valid and write $\models \alpha$
if and only if, for every possible world $w \in W_P$, $w \models_1 \alpha$ holds. Finally, given
two sentences $\alpha$ and $\beta$, we say that $\beta$ is a logical consequence of $\alpha$ if and only if
$\models \alpha \supset \beta$ holds. The following lemma captures an important structural property
of the support relations. It will be frequently used in the remaining sections.

Lemma 1. For any declaration $\alpha$ and any pair of valuations v, v' such that
$v \subseteq v'$, if $v \models_1 \alpha$ then $v' \models_1 \alpha$, and if $v \models_0 \alpha$ then $v' \models_0 \alpha$.

Proof. Straightforward by induction on the structure of $\alpha$.
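The two support relations and the monotonicity property of Lemma 1 can be explored by brute force on a small set of atoms. The Python sketch below is an assumption-laden illustration rather than the paper's machinery: formulas are nested tuples, the modalities are written `("box", S, body)` and `("dia", S, body)`, and the names `supports` and `lemma1_holds` are hypothetical.

```python
from itertools import product

VALUES = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]


def valuations(atoms):
    atoms = sorted(atoms)
    return [dict(zip(atoms, vals)) for vals in product(VALUES, repeat=len(atoms))]


def supports(v, formula, sign, atoms):
    """sign=1: truth support of `formula` by v; sign=0: falsity support."""
    op = formula[0]
    if op == "atom":
        return sign in v[formula[1]]
    if op == "not":
        return supports(v, formula[1], 1 - sign, atoms)
    if op in ("and", "or"):
        left = supports(v, formula[1], sign, atoms)
        right = supports(v, formula[2], sign, atoms)
        # 'and' is conjunctive for truth, 'or' is conjunctive for falsity.
        conjunctive = (op == "and") == (sign == 1)
        return (left and right) if conjunctive else (left or right)
    if op in ("box", "dia"):
        resources, body = formula[1], formula[2]
        hits = [supports(w, body, 1, atoms) for w in valuations(atoms)
                if all(w[p] == v[p] for p in resources)]
        truth = all(hits) if op == "box" else any(hits)
        return truth if sign == 1 else not truth
    raise ValueError(f"unknown connective: {op}")


def lemma1_holds(declaration, atoms):
    """Check that both supports are preserved when a valuation grows pointwise."""
    vals = valuations(atoms)
    for v in vals:
        for w in vals:
            if all(v[p] <= w[p] for p in atoms):
                for sign in (0, 1):
                    if (supports(v, declaration, sign, atoms)
                            and not supports(w, declaration, sign, atoms)):
                        return False
    return True
```

A consequence visible in this encoding is that the box modality only needs the smallest S-equivalent valuation (atoms outside S mapped to the empty set) and the diamond modality only the largest one (atoms outside S mapped to {0, 1}); the sketch enumerates all of them for clarity rather than efficiency.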



2.3 Axiomatization 

We now focus on obtaining a sound and complete axiomatization for our logic.
An axiom system consists of a collection of axioms and inference rules. A proof
in an axiom system is a finite sequence of sentences, each of which is either an
instance of an axiom or follows by an application of an inference rule. Finally,
we say that a sentence $\alpha$ is a theorem of the axiom system and write $\vdash \alpha$ if there
exists a proof of $\alpha$ in the system. The axiom system of ARL is the following:

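Axioms:

(A1) All tautologies of propositional logic.

(A2) $\Box_S\alpha \equiv \Box_S\neg\neg\alpha$

(A3) $\Box_S(\alpha \wedge \beta) \equiv \Box_S(\beta \wedge \alpha)$
     $\Box_S(\alpha \vee \beta) \equiv \Box_S(\beta \vee \alpha)$

(A4) $\Box_S(\alpha \wedge (\beta \wedge \gamma)) \equiv \Box_S((\alpha \wedge \beta) \wedge \gamma)$
     $\Box_S(\alpha \vee (\beta \vee \gamma)) \equiv \Box_S((\alpha \vee \beta) \vee \gamma)$

(A5) $\Box_S(\alpha \wedge (\beta \vee \gamma)) \equiv \Box_S((\alpha \wedge \beta) \vee (\alpha \wedge \gamma))$
     $\Box_S(\alpha \vee (\beta \wedge \gamma)) \equiv \Box_S((\alpha \vee \beta) \wedge (\alpha \vee \gamma))$

(A6) $\Box_S\neg(\alpha \wedge \beta) \equiv \Box_S(\neg\alpha \vee \neg\beta)$
     $\Box_S\neg(\alpha \vee \beta) \equiv \Box_S(\neg\alpha \wedge \neg\beta)$

(A7) $\Box_S\alpha \wedge \Box_S\beta \equiv \Box_S(\alpha \wedge \beta)$
     $\Box_S\alpha \vee \Box_S\beta \equiv \Box_S(\alpha \vee \beta)$

(A8) $\Box_{S \cup \{p\}}(p \vee \neg p)$
     $\Diamond_{S \setminus \{p\}}(p \wedge \neg p)$

(A9) $\Box_S\alpha \supset \alpha$

(A10) $\Box_S\alpha \equiv \neg\Diamond_S\neg\alpha$

Inference rule:

(R1) From $\vdash \alpha$ and $\vdash \alpha \supset \beta$ infer $\vdash \beta$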
We remark that the axiom (A1) and the inference rule (R1) come from
propositional logic; hence the propositional subset of ARL is correctly handled. The
other axioms capture the properties of the two families of modal operators. From
an intuitive point of view, these axioms may be classified into two categories.
The first one consists of the standard axioms (A2)-(A6), which capture the
properties of double negation, commutativity, associativity, distributivity and
De Morgan's laws. The specificity of the logic ARL lies in the second category.
Axiom (A7) captures the conjunctive and disjunctive properties of the operators
$\Box_S$. Axiom (A8) is the key point of approximate reasoning. More precisely, if
the parameter S is expanded by the atom p, then the system necessarily infers
the tautology $p \vee \neg p$. Dually, if S is contracted by p, then the system can infer
the antilogy $p \wedge \neg p$. Axiom (A9), often called T, demonstrates that reasoning
under the scope of the modality $\Box_S$ is sound. Finally, axiom (A10) captures the
duality property between the modal operators $\Box_S$ and $\Diamond_S$.

The following result gives soundness and completeness for the axiom system. 

Theorem 2. For every sentence $\alpha$ of the logic ARL,

$\vdash \alpha$ iff $\models \alpha$.

Proof. The proof is presented in Appendix A. 



3 Anytime Deducers 

After an excursion into the logic ARL, we now apply our results to the formal 
specifications of anytime deducers. In the context suggested by our approach, we 
define an anytime deducer as a knowledge based system that approximates the 
inference process by using an increasing sequence of resource parameters. Intu- 
itively, the more time the system has to evaluate the query, the more resources it 
can spend. However, anytime deducers are on-line reasoners; there is no notion 
of directly processing the original knowledge base to obtain an approximation 
to it. This last requirement will be incorporated in the next section. 

Following Levesque [11], we specify an anytime deducer as an “abstract type”
that interacts with the user through a given set of service routines. To this end,
we focus on two core operations, ASK and TELL, which allow a user to query the
knowledge base and to add new information to it. In the following definition,
$L_P$ denotes the set of all declarations of the logic ARL.

Definition 3. An anytime deducer consists of an operation TELL from $L_P \times L_P$
to $L_P$, and an operation ASK from $L_P \times 2^P \times L_P$ to {YES, NO, *}, respectively
defined as follows:

$$\mathrm{TELL}(A, \alpha) = A \wedge \alpha,$$

$$\mathrm{ASK}(A, S, \alpha) = \begin{cases} \mathrm{YES}, & \text{if } \models \Box_S (A \supset \alpha), \\ \mathrm{NO}, & \text{if } \not\models \Diamond_S (A \supset \alpha), \\ *, & \text{otherwise.} \end{cases}$$

The basic difference from the standard approach to knowledge representation
lies in the ASK operation, which explicitly includes the notion of resource in order to
capture the inference capabilities of the system.
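
As an illustration, the two operations can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not the system described here: formulas are encoded as nested tuples, valuations map each atom to a subset of {0, 1}, and the connectives are read in the usual four-valued way (the semantics of Section 2 is assumed, not restated). Following the proof of Theorem 7, validity of $\Box_S$ is checked with atoms outside S mapped to the empty set; the dual choice {0, 1} for $\Diamond_S$ is an assumption motivated by axiom (A8). All function names are hypothetical.

```python
from functools import reduce
from itertools import product

# Hypothetical formula constructors: formulas are nested tuples.
def atom(p):
    return ("atom", p)

def neg(f):
    return ("not", f)

def conj(*fs):
    return reduce(lambda x, y: ("and", x, y), fs)

def disj(*fs):
    return reduce(lambda x, y: ("or", x, y), fs)

def atoms_of(f):
    """Collect the atom names occurring in a formula."""
    if f[0] == "atom":
        return {f[1]}
    return set().union(*(atoms_of(g) for g in f[1:]))

def support(f, val):
    """Return (supported true, supported false) for f under val.

    val maps each atom to a subset of {0, 1} (assumed four-valued reading).
    """
    op = f[0]
    if op == "atom":
        return (1 in val[f[1]], 0 in val[f[1]])
    if op == "not":
        t, fl = support(f[1], val)
        return (fl, t)
    (lt, lf), (rt, rf) = support(f[1], val), support(f[2], val)
    if op == "and":
        return (lt and rt, lf or rf)
    return (lt or rt, lf and rf)  # op == "or"

def valid_under(f, s_atoms, outside):
    """Check f under every classical assignment to S; atoms outside S get `outside`."""
    s = sorted(s_atoms)
    others = atoms_of(f) - set(s)
    for bits in product((0, 1), repeat=len(s)):
        val = {p: {b} for p, b in zip(s, bits)}
        val.update({q: set(outside) for q in others})
        if not support(f, val)[0]:
            return False
    return True

def tell(kb, alpha):
    return ("and", kb, alpha)

def ask(kb, s_atoms, alpha):
    goal = ("or", ("not", kb), alpha)  # kb implies alpha
    if valid_under(goal, s_atoms, outside=()):        # box_S reading (assumed)
        return "YES"
    if not valid_under(goal, s_atoms, outside=(0, 1)):  # diamond_S reading (assumed)
        return "NO"
    return "*"

a, b, c, d = (atom(x) for x in "abcd")
kb = conj(disj(neg(a), b, c), disj(a, b, neg(d)),
          disj(a, neg(b), d), disj(neg(a), neg(b), c))
query = disj(neg(a), c)  # a implies c

print(ask(kb, set(), query))             # no resources: undetermined
print(ask(kb, {"a", "b", "c"}, query))   # three atoms suffice here
print(ask(kb, set("abcd"), b))           # full resources: classical answer
```

With the empty parameter the sketch returns `*`; once a, b and c are available it returns YES for the query a implies c; and with all atoms available the query b is rejected, since the knowledge base has a classical model in which b is false.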

Based on these considerations, the anytime deduction process is defined by
an increasing sequence of resource parameters $(S_0 = \emptyset \subseteq \cdots \subseteq S_i \subseteq \cdots \subseteq S_n = P)$
that approximate the set $\overline{A} = \{\alpha : \models A \supset \alpha\}$ by means of two dual families
of sets $A_i^{\Box} = \{\alpha : \models \Box_{S_i} (A \supset \alpha)\}$ and $A_i^{\Diamond} = \{\alpha : \models \Diamond_{S_i} (A \supset \alpha)\}$. If we
prove membership in any $A_i^{\Box}$ then we have proved membership in $\overline{A}$. Dually, if
we disprove membership in any $A_i^{\Diamond}$ then we have disproved membership in $\overline{A}$.
This stepwise process has the important advantage that the iteration may be
stopped when a confirming answer is already obtained for a small index i. This
yields a potentially drastic reduction of the computational costs. The following
properties clarify the interest of our logic in the setting of anytime deduction.

Theorem 4 (Monotonicity). For any declaration $\alpha$ and any resource
parameters S and S' such that $S \subseteq S'$,

if $\models \Box_S \alpha$ then $\models \Box_{S'} \alpha$, (1)

if $\not\models \Diamond_S \alpha$ then $\not\models \Diamond_{S'} \alpha$. (2)

Proof. Let us examine part (1). Assume that $\models \Box_S \alpha$ and $\not\models \Box_{S'} \alpha$. In the
first case, for any possible world w and any valuation v such that $w \sim_S v$,
we have $v \models_1 \alpha$. In the second case, there exists a possible world w' and a
valuation v' such that $w' \sim_{S'} v'$ and $v' \not\models_1 \alpha$. Let us define a new valuation
v'' such that $\forall p \in S$, $v''(p) = w'(p)$ and $\forall q \notin S$, $v''(q) = \emptyset$. It is clear that
$w' \sim_S v''$. Moreover, we have $v'' \subseteq v'$. By application of lemma 1, it follows that
$v'' \not\models_1 \alpha$. Therefore, we obtain $w' \not\models_1 \Box_S \alpha$, but this contradicts the hypothesis
that $\models \Box_S \alpha$. A dual argument applies to part (2).

Corollary 5 (Convergence). For any declaration $\alpha$,

if $\models \alpha$ then there exists a resource parameter S such that $\models \Box_S \alpha$, (1)

if $\not\models \alpha$ then there exists a resource parameter S such that $\not\models \Diamond_S \alpha$. (2)

Theorem 6 (Duality). For any declaration $\alpha$,

$\models \Box_S \alpha$ iff $\Diamond_S \neg\alpha$ is unsatisfiable, (1)

$\not\models \Diamond_S \alpha$ iff $\Box_S \neg\alpha$ is satisfiable. (2)

Proof. Let us examine part (1). Assume that $\Box_S \alpha$ is valid and that $\Diamond_S \neg\alpha$ is
satisfiable. By application of axioms (A10) and (A2), it follows that $\neg\Box_S \alpha$ is
satisfiable. Therefore, there exists a possible world w such that $w \models_1 \neg\Box_S \alpha$. So,
it follows that $w \models_0 \Box_S \alpha$ and $w \not\models_1 \Box_S \alpha$, but this contradicts the hypothesis
that $\Box_S \alpha$ is valid. Dual considerations hold for part (2).

Theorem 7 (Complexity). For any declaration $\alpha$ and any resource parameter
S, there exists an algorithm for deciding whether $\Box_S \alpha$ is satisfiable and whether $\Diamond_S \alpha$ is
satisfiable which runs in $O(|\alpha| \cdot 2^{|S|})$ time.

Proof. We begin by showing that, for the equivalence relation $\sim_S$, at most one valuation
of each equivalence class is needed for the satisfiability test. The sentence $\Box_S \alpha$ is
satisfiable if there exists a possible world w such that $w \models_1 \Box_S \alpha$. Therefore, for
any valuation v such that $v \sim_S w$, we have $v \models_1 \alpha$. Let us define the valuation
$v_\bot$ such that $\forall p \in S$, $v_\bot(p) = w(p)$ and $\forall q \notin S$, $v_\bot(q) = \emptyset$. It is clear that
$w \sim_S v_\bot$. By application of lemma 1, if $v_\bot \models_1 \alpha$ then $v \models_1 \alpha$ for any v such
that $w \sim_S v$. Therefore, $w \models_1 \Box_S \alpha$ iff $v_\bot \models_1 \alpha$. Now we turn to the satisfiability
test. For each valuation $v_\bot$, the truth value of $v_\bot \models_1 \alpha$ can be determined in $O(|\alpha|)$
time. Since there exist $2^{|S|}$ valuations $v_\bot$, checking whether $\Box_S \alpha$ is satisfiable
can be done in $O(|\alpha| \cdot 2^{|S|})$ time. A dual argument applies to the sentence $\Diamond_S \alpha$.
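
The enumeration in this proof can be sketched for clausal theories as follows. The sketch is illustrative only and rests on assumptions: clauses are lists of (atom, polarity) pairs, atoms outside S support no literal when testing $\Box_S A$ (the valuation $v_\bot$ of the proof), and they support every literal when testing $\Diamond_S A$ (an assumed dual reading). The driver `anytime_sat` and the two small theories are hypothetical; the driver grows S along a given order and stops as soon as one of the two tests is conclusive.

```python
from itertools import product

def _assignments(s_atoms):
    """Enumerate the 2^|S| classical assignments to the atoms of S."""
    s = sorted(s_atoms)
    for bits in product((False, True), repeat=len(s)):
        yield dict(zip(s, bits))

def sat_box(clauses, s_atoms):
    """box_S A satisfiable: every clause holds through a literal over S."""
    return any(
        all(any(a in v and v[a] == pos for a, pos in c) for c in clauses)
        for v in _assignments(s_atoms))

def sat_diamond(clauses, s_atoms):
    """diamond_S A satisfiable: literals over atoms outside S always count as true."""
    return any(
        all(any(a not in v or v[a] == pos for a, pos in c) for c in clauses)
        for v in _assignments(s_atoms))

def anytime_sat(clauses, order):
    """Grow S along `order`; stop at the first conclusive answer."""
    for i in range(len(order) + 1):
        s = order[:i]
        if sat_box(clauses, s):
            return ("SAT", s)       # sound: A is satisfiable
        if not sat_diamond(clauses, s):
            return ("UNSAT", s)     # complete: A is unsatisfiable
    return ("UNKNOWN", list(order))  # only if `order` omits some atoms

# Hypothetical theories used only for illustration.
theory1 = [[("p", True), ("q", True)],
           [("p", False), ("q", True)],
           [("r", True), ("s", True)]]
theory2 = [[("p", True)], [("p", False)], [("q", True), ("r", True)]]

print(anytime_sat(theory1, ["q", "r", "p", "s"]))
print(anytime_sat(theory2, ["p", "q", "r"]))
```

On the first theory the loop stops after two atoms with a positive answer; on the second, a single atom is enough to detect unsatisfiability. For the empty parameter the two tests reduce to checking for the empty base and for the empty clause, respectively.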

Notice that the above complexity result is just the worst-case upper bound
of an enumeration algorithm. Although such an analysis is important, we do not
claim that the “brute force” method is the most feasible one. Actually, in the case
of clausal theories, we can use a resolution-based algorithm which computes the
satisfiability of $\Box_{S_i} A$ and $\Diamond_{S_i} A$ by means of an increasing sequence of parameters
$S_i$. For $S_0 = \emptyset$, this respectively corresponds to checking whether A is the empty
base and whether A contains the empty clause. For a given theory $A_i$ which is not empty
and does not contain the empty clause, the procedure first resolves all clauses
in $A_i$ upon the literals $p_{i+1}$ and $\neg p_{i+1}$, next eliminates all clauses in $A_i$ containing
these literals, and then checks whether the resulting theory is the empty base
or contains the empty clause. It is interesting to remark that such an algorithm
is indeed incremental; the procedure is able to exploit information gained in
previous steps and does not need to redo all computations from scratch.

The correct choice of S is crucial for the usefulness of deduction. Taken to
the extreme, when S is chosen incorrectly, anytime deduction may end up as
expensive as classical deduction. From this perspective, several heuristics have
been proposed in the literature [6,9,10,22]. For example, in a resolution-based
algorithm, the atoms of S may be dynamically chosen using the minimal diversity
heuristic advocated in [6]. The diversity of an atom p is the product of the
number of positive occurrences by the number of negative occurrences of p in
the theory. This notion is based on the observation that an atom can be resolved
upon only when it appears both positively and negatively in different clauses.
Hence, choosing an atom p whose diversity is minimal will minimize the number
of resolvents that can be generated upon p. This heuristic may be augmented
with other strategies such as boolean constraint propagation and the minimal
width heuristic. These considerations are illustrated in the following example.
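
The diversity measure itself is straightforward to compute. The following Python sketch is illustrative only: the clause encoding as lists of (atom, polarity) pairs, the function names and the sample theory are all assumptions, and ties are broken alphabetically, which the text does not prescribe.

```python
from collections import Counter

def diversity(clauses):
    """Map each atom to (#positive occurrences) * (#negative occurrences)."""
    pos, neg = Counter(), Counter()
    for clause in clauses:
        for atom, positive in clause:
            (pos if positive else neg)[atom] += 1
    return {a: pos[a] * neg[a] for a in set(pos) | set(neg)}

def pick_min_diversity(clauses):
    """Choose an atom of minimal diversity (alphabetical tie-break, an assumption)."""
    d = diversity(clauses)
    return min(sorted(d), key=d.get)

# Hypothetical theory: (p v q), (-p v q), (-p v r), (p v -r)
theory = [[("p", True), ("q", True)],
          [("p", False), ("q", True)],
          [("p", False), ("r", True)],
          [("p", True), ("r", False)]]

print(diversity(theory))
print(pick_min_diversity(theory))
```

Here p has diversity 4, r has diversity 1 and q, which occurs only positively, has diversity 0, so no resolvent can be generated upon q.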

Example 8. Suppose we are given $A = \{(\neg a \vee b \vee c), (a \vee b \vee \neg d), (a \vee \neg b \vee d), (\neg a \vee \neg b \vee c)\}$.
We want to prove that A is satisfiable. Hence, we need to find a subset
S of {a, b, c, d} such that $\Box_S A$ is satisfiable. Starting with $S = \emptyset$ and using the
minimal diversity heuristic, we gradually add the atoms b and d to S. This is
sufficient for proving that A is satisfiable. Now, we want to show that $a \supset c$ is
entailed by A. Hence, we need to find a subset S such that $\Diamond_S (A \wedge a \wedge \neg c)$ is
unsatisfiable. Using boolean constraint propagation, we iteratively add to S the
atoms a, c and b. This is sufficient for proving that $(A \wedge a \wedge \neg c)$ is unsatisfiable.
Therefore, $a \supset c$ is indeed a logical consequence of A.



4 Anytime Compilers 

In this section, we extend the concepts developed so far to the formal specifi- 
cations of anytime compilers. Such systems can perform off-line reasoning: they 
approximate the original knowledge base by allowing a sequence of more and 
more powerful data structures. The quality of compilation depends on the com- 
putational resources that have been spent. However, the computational cost of 
compilation is amortized over a potentially very large set of queries and the 
resulting data structures can be used in processing each query. 

In the setting of knowledge compilation, it is well-known that every knowl- 
edge base has two specific normal forms, namely a conjunctive normal form 
(CNF) and a disjunctive normal form (DNF), from which queries can be effi- 
ciently answered. These normal forms are computed by means of the so-called 
prime implicates and prime implicants. An attractive property of this approach 
stems from the fact that the program used to generate the normal forms can 
be stopped before completion. More specifically, an interruptible process for 
generating prime implicates is sound but incomplete with respect to exact com- 
pilation. On the other hand, an interruptible prime implicant generation process 
is unsound but complete with respect to exact compilation. Based on these con- 
siderations, we present a method for generating prime implicates and prime 
implicants defined in terms of the logic ARL. 

To that end, we introduce some additional definitions. A literal is an atom
or its negation, a clause is a finite disjunction of literals and a term is a finite
conjunction of literals. When there is no risk of confusion, we shall model clauses
and terms as sets of literals. A clause $\gamma$ is called an S-implicate of a knowledge
base A if $\models \Box_S (A \supset \gamma)$ and $\gamma$ does not contain two complementary literals.
Dually, a term $\tau$ is an S-implicant of A if $\models \Box_S (\tau \supset A)$ and $\tau$ does not contain
two complementary literals. A clause $\gamma$ is called a prime S-implicate of A if $\gamma$ is
an S-implicate of A and for every other S-implicate $\gamma'$ of A, we have $\gamma' \not\subset \gamma$. In
a similar way, a term $\tau$ is called a prime S-implicant of A if $\tau$ is an S-implicant
of A, and for every other S-implicant $\tau'$ of A, we have $\tau' \not\subset \tau$.

In the remainder of the paper, the conjunction of all the prime S-implicates of a
knowledge base A is denoted PIC(A, S) and the disjunction of all the prime
S-implicants of A is denoted PID(A, S). When clear from the context, such
sentences will be respectively modeled as sets of clauses and sets of terms. Now,
we have the formal tools for specifying anytime compilers.

Definition 9. An anytime compiler consists of an operation TELL from $L_P \times 2^P \times L_P$
to $L_P \times L_P$, and an operation ASK from $L_P \times L_P \times L_P$ to {YES, NO, *},
respectively defined as follows:

$$\mathrm{TELL}(A, S, \alpha) = (\mathrm{PIC}(A \wedge \alpha, S), \mathrm{PID}(A \wedge \alpha, S)),$$

$$\mathrm{ASK}(A_{PIC}, A_{PID}, \alpha) = \begin{cases} \mathrm{YES}, & \text{if } \models A_{PIC} \supset \alpha, \\ \mathrm{NO}, & \text{if } \not\models A_{PID} \supset \alpha, \\ *, & \text{otherwise.} \end{cases}$$

The basic idea underlying the above definition is to invoke compilation during
the TELL operation. Hence, the resulting data structures $A_{PIC}$ and $A_{PID}$ may
be used in the ASK operation in order to answer many queries.

As for deduction, the compilation process may be modeled by an increasing
sequence of parameters $(S_0 = \emptyset \subseteq \cdots \subseteq S_i \subseteq \cdots \subseteq S_n = P)$ that approximate
the deductive closure of A, denoted $\overline{A}$, by means of two dual families of sets
$A_i^{PIC} = \{\alpha : \models \mathrm{PIC}(A, S_i) \supset \alpha\}$ and $A_i^{PID} = \{\alpha : \models \mathrm{PID}(A, S_i) \supset \alpha\}$. For
a given index i, if we prove membership in $A_i^{PIC}$ then we have also proved
membership in $\overline{A}$. On the other hand, if we disprove membership in $A_i^{PID}$ then
we have also disproved membership in $\overline{A}$. The reader might wonder at this point
if there exists a close relationship between the two previous families $A_i^{\Box}$ and
$A_i^{\Diamond}$ generated during anytime deduction, and the two families $A_i^{PIC}$ and $A_i^{PID}$
defined during anytime compilation. In fact, as stated in the following theorem,
there is a one-to-one correspondence between these families.

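Theorem 10 (Correspondence). For any knowledge base A, any declaration
$\alpha$, and any resource parameter S,

$\models \mathrm{PIC}(A, S) \supset \alpha$ iff $\models \Box_S (A \supset \alpha)$, (1)

$\models \mathrm{PID}(A, S) \supset \alpha$ iff $\models \Diamond_S (A \supset \alpha)$. (2)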

Proof. We begin by introducing some useful definitions. We denote by $V_S^{\top}$ the set of
valuations v such that for every atom $p \in P$, $v(p) = \{0\}$ or $v(p) = \{1\}$ if $p \in S$,
and $v(p) = \{0, 1\}$ otherwise. Dually, $V_S^{\bot}$ denotes the set of valuations v such that
for every $p \in P$, $v(p) = \{0\}$ or $v(p) = \{1\}$ if $p \in S$, and $v(p) = \emptyset$ otherwise.

Let us examine part (1). A sufficient condition for proving (1) is to state that
for any $v \in V_S^{\top}$, $v \models_1 A$ iff $v \models_1 \mathrm{PIC}(A, S)$.

— Suppose that there is a $v \in V_S^{\top}$ such that $v \models_1 A$ and $v \not\models_1 \mathrm{PIC}(A, S)$. In this
case, there must exist at least one clause $\gamma$ in PIC(A, S) such that $v \not\models_1 \gamma$.
Since $\gamma$ is a prime S-implicate, the sentence $\Diamond_S (A \wedge \neg\gamma)$ is unsatisfiable. So,
either $v \not\models_1 \neg\gamma$ holds or $v \not\models_1 A$ holds. In the first case, we would obtain
$v \not\models_1 \gamma \vee \neg\gamma$, but this is impossible since from the definition of v there must exist
at least one possible world $w \subseteq v$ such that $w \models_1 \gamma \vee \neg\gamma$ and by lemma 1,
$v \models_1 \gamma \vee \neg\gamma$. So, $v \not\models_1 A$ holds, hence contradiction.

— Now, assume that there is a $v \in V_S^{\top}$ such that $v \not\models_1 A$ and $v \models_1 \mathrm{PIC}(A, S)$.
Thus, for every prime S-implicate $\gamma$ in PIC(A, S), we must have $v \models_1 \gamma$.
Suppose that A is unsatisfiable. Then it is easy to prove that PIC(A, S) is
either empty or contains the empty clause. In both cases, it follows that
$v \not\models_1 \mathrm{PIC}(A, S)$, hence contradiction. Now, suppose that A is satisfiable.
Since $\gamma$ is a prime S-implicate of A, then by theorem 4, it follows that $\gamma$ is a
prime implicate of A. Therefore, for every possible world w such that $w \models_1 A$
there exists at least one literal $l \in \gamma$ such that $w \models_1 l$. If for every $l \in \gamma$, we
have $v \models_1 l$, then there exists at least one world $w \subseteq v$ such that $w \models_1 l$ and
$w \models_1 A$. By lemma 1, it follows that $v \models_1 A$, hence contradiction. If there
exists a $\gamma' \subset \gamma$ such that $v \models_1 \gamma'$, then for every possible world w such that
$w \models_1 A$, we have $w \models_1 \gamma'$. Since $\Diamond_S (A \wedge \neg\gamma')$ is unsatisfiable, then $\gamma'$ is an
S-implicate of A. Therefore $\gamma \notin \mathrm{PIC}(A, S)$, hence contradiction.

We now turn to part (2). In a dual way, a sufficient condition for proving (2) is
to show that for any $v \in V_S^{\bot}$, $v \models_1 A$ iff $v \models_1 \mathrm{PID}(A, S)$.

— Suppose that there exists a $v \in V_S^{\bot}$ such that $v \models_1 A$ and $v \not\models_1 \mathrm{PID}(A, S)$.
If $v \models_1 A$, then viewing v as a set of literals, we must have $\models v \supset A$.
Moreover, from the definition of v, it follows that $\models \Box_S (v \supset A)$. Hence, v is an
S-implicant of A. It is clear that v cannot be a prime S-implicant, because
otherwise we would have $v \models_1 \mathrm{PID}(A, S)$. Therefore, there exists a term
$\tau \in \mathrm{PID}(A, S)$ such that $\tau \subset v$ and $\tau \models_1 A$. However, by lemma 1, it follows
that V A, hence contradiction. 

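— Now, suppose that there exists a $v \in V_S^{\bot}$ such that $v \not\models_1 A$ and $v \models_1 \mathrm{PID}(A, S)$.
In this case, there exists a term $\tau \in \mathrm{PID}(A, S)$ such that $v \models_1 \tau$,
that is, $\tau \subseteq v$. Since $\Box_S (\tau \supset A)$ is valid, we must also have $\tau \models_1 A$. By
lemma 1 it follows that $v \models_1 A$, hence contradiction.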

The correspondence result above is very interesting because most of the
properties stated for anytime deduction also hold in the setting of anytime
compilation. As an example, for any parameters S and S' such that $S \subseteq S'$, we can state
that PIC(A, S') is at least as complete as PIC(A, S) and that PID(A, S') is at least as sound as
PID(A, S). The next result is even stronger than monotonicity; we show that
anytime compilation is incremental, using information gained in previous steps.

Theorem 11 (Incrementality). For any knowledge base A, and any resource
parameters S and S' such that $S \subseteq S'$,

$\mathrm{PIC}(A, S) \subseteq \mathrm{PIC}(A, S')$, (1)

$\mathrm{PID}(A, S) \subseteq \mathrm{PID}(A, S')$. (2)

Proof. Let us demonstrate part (1). Suppose that $\gamma \in \mathrm{PIC}(A, S)$ and $\gamma \notin \mathrm{PIC}(A, S')$.
Since $\models \Box_S (A \supset \gamma)$ holds, then by theorem 4, $\models \Box_{S'} (A \supset \gamma)$ also
holds. Hence, there must exist a clause $\gamma'$ such that $\gamma' \subset \gamma$ and $\models \Box_{S'} (A \supset \gamma')$.
However, in this case it is clear that $\models \Box_S (A \supset \gamma')$ holds. So $\gamma \notin \mathrm{PIC}(A, S)$,
hence contradiction. An analogous strategy applies to part (2).

The complexity result presented below clarifies the interest of the compilation
process from a computational point of view. More precisely, we demonstrate that
entailment of CNF queries can be computed in time polynomial in the size of
the resulting data structure plus the size of the query.

Theorem 12 (Complexity). For any knowledge base A, any resource parameter
S and any declaration $\alpha$ in CNF, there exists an algorithm for deciding
whether $\mathrm{PIC}(A, S) \supset \alpha$ is valid (resp. $\mathrm{PID}(A, S) \supset \alpha$ is valid) which runs in
$O(|\mathrm{PIC}(A, S)| + |\alpha|)$ (resp. $O(|\mathrm{PID}(A, S)| + |\alpha|)$).

Proof. $\models \mathrm{PIC}(A, S) \supset \alpha$ holds iff for every non-tautological clause $\gamma'$ of $\alpha$ there
is a prime S-implicate $\gamma$ of PIC(A, S) such that $\gamma \subseteq \gamma'$. This can be done in
$O(|\mathrm{PIC}(A, S)| + |\alpha|)$. On the other hand, $\models \mathrm{PID}(A, S) \supset \alpha$ holds iff every clause
$\gamma$ of $\alpha$ has a non-empty intersection with every S-implicant $\tau$ of PID(A, S). This can
be done in $O(|\mathrm{PID}(A, S)| + |\alpha|)$.
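
The two tests used in this proof translate directly into set operations. The Python sketch below is illustrative only: clauses and terms are encoded as frozensets of (atom, polarity) pairs, all function names are hypothetical, and the nested loops are a naive rendering that does not attempt to reach the stated bound. The sample data structures are the sets {a ∨ b} and {b} quoted in Example 13.

```python
def is_tautology(clause):
    """A clause is tautological if it contains two complementary literals."""
    return any((a, not pos) in clause for a, pos in clause)

def pic_entails(pic, query_cnf):
    """Every non-tautological query clause is subsumed by some clause of PIC."""
    return all(is_tautology(c) or any(g <= c for g in pic) for c in query_cnf)

def pid_entails(pid, query_cnf):
    """Every query clause shares a literal with every term of PID."""
    return all(all(t & c for t in pid) for c in query_cnf)

def ask_compiled(pic, pid, query_cnf):
    if pic_entails(pic, query_cnf):
        return "YES"
    if not pid_entails(pid, query_cnf):
        return "NO"
    return "*"

# Compiled structures taken from Example 13: PIC = {a v b}, PID = {b}.
pic = [frozenset({("a", True), ("b", True)})]
pid = [frozenset({("b", True)})]

q1 = [frozenset({("a", True), ("b", True), ("c", True)})]  # a v b v c
q2 = [frozenset({("c", True)})]                            # c
q3 = [frozenset({("b", True)})]                            # b

print(ask_compiled(pic, pid, q1))
print(ask_compiled(pic, pid, q2))
print(ask_compiled(pic, pid, q3))
```

The first query is subsumed by a ∨ b and is accepted, the second shares no literal with the term b and is rejected, and the third remains undetermined at this stage of compilation.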

Clearly, the effectiveness of compilation for subsequent query processing de- 
pends on the size of its resulting data structures. In the setting suggested by 
our approach, a knowledge base can have at most 5'-implicates and 
b'-implicants. Using the results by Chandra and Markowsky in [4], the same 
knowledge base may have on average /|S'| prime 5'-implicates and /vT^ 
prime 5'-implicants. From this point of view, the interest of anytime compila- 
tion is that it has potential to greatly decrease the inconvenience of using exact 
compilation: since off-line reasoning may be too space demanding, it is clearly 
desirable to be able to process queries before completion. 

Several algorithms can be used to compute prime S-implicates and prime
S-implicants. In the first case, we may conceive a stepwise procedure which starts
by computing PIC(A, S) for $S = \emptyset$ and iteratively enlarges S.
For $S = \emptyset$, this corresponds to checking whether the knowledge base A contains
the empty clause or not. For a theory A which does not contain the empty clause,
we check each possible implicate $\gamma$ in turn by adding its negation to the base and
then testing $\Diamond_S (A \wedge \neg\gamma)$ for satisfiability. If the sentence is refuted, then $\gamma$ is an
S-implicate of A. By allowing an increasing sequence of parameters S, we can
detect already subsumed implicates and exclude them from prime S-implicate
generation. As far as PID(A, S) is concerned, dual considerations hold.
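
A brute-force version of this generate-and-test procedure can be sketched as follows. It is a sketch under a stated assumption rather than the algorithm intended here: we read the refutation test as a classical entailment check restricted to the clauses of A that mention only atoms of S, on the grounds that atoms outside S can falsify no clause under $\Box_S$. Candidate clauses are enumerated by increasing size so that subsumed candidates are skipped. The encoding and all names are hypothetical, and the enumeration is exponential in |S|.

```python
from itertools import combinations, product

def entails_over_s(clauses, s_atoms, gamma):
    """Classical check: do the clauses lying within S entail gamma?"""
    s = sorted(s_atoms)
    sub = [c for c in clauses if all(a in s for a, _ in c)]
    for bits in product((False, True), repeat=len(s)):
        v = dict(zip(s, bits))
        satisfied = all(any(v[a] == pos for a, pos in c) for c in sub)
        if satisfied and not any(v[a] == pos for a, pos in gamma):
            return False  # countermodel found
    return True

def prime_s_implicates(clauses, s_atoms):
    """Enumerate candidate clauses over S by increasing size; keep minimal ones."""
    lits = [(a, sign) for a in sorted(s_atoms) for sign in (True, False)]
    found = []
    for size in range(len(s_atoms) + 1):
        for combo in combinations(lits, size):
            gamma = frozenset(combo)
            if len({a for a, _ in gamma}) < len(gamma):
                continue  # contains two complementary literals
            if any(g <= gamma for g in found):
                continue  # subsumed by an implicate found earlier
            if entails_over_s(clauses, s_atoms, gamma):
                found.append(gamma)
    return set(found)

# Knowledge base of Example 13.
kb = [[("a", True), ("b", True), ("c", True)],
      [("a", True), ("b", True), ("c", False)],
      [("b", True), ("d", True), ("e", False)],
      [("b", True), ("d", False), ("e", True)]]

print(prime_s_implicates(kb, set()))
print(prime_s_implicates(kb, {"a", "b", "c"}))
```

For the empty parameter nothing is produced, since the base does not contain the empty clause; for S = {a, b, c} the sketch returns the single clause a ∨ b.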

As for anytime deduction, the correct choice of the parameter S is
important for the usefulness of anytime compilation. This choice may be heuristic; in this
case, the atoms of S are iteratively selected to minimize the predicted number of
generated prime S-implicates and prime S-implicants, using strategies suggested
in [6,18,19]. Alternatively, the choice of S may be guided by query answering
considerations. The letters selected during deduction are used in turn for
compilation. In other words, the work done for one query is saved for use in answering
the next query. These considerations are illustrated in the following example.

Example 13. We are given $A = \{(a \vee b \vee c), (a \vee b \vee \neg c), (b \vee d \vee \neg e), (b \vee \neg d \vee e)\}$.
Suppose we want to generate at least one prime S-implicate of A. Starting
with $S = \emptyset$ and using the minimal diversity heuristic, we gradually add to S
the atoms b, a and c. This is sufficient to obtain $\mathrm{PIC}(A, S) = \{a \vee b\}$. Now,
suppose that the system is frequently asked CNF queries containing the atom
b. In this case, we iteratively add to S the atoms b, d and e. Hence, we obtain
$\mathrm{PIC}(A, S) = \{b\}$. Notice that in both scenarios we have $\mathrm{PID}(A, S) = \{b\}$.



5 Conclusion

In this paper, we have dealt with the problem of reasoning in propositional 
knowledge bases focusing on two attractive approaches, namely, anytime deduc- 
tion and anytime compilation. The first one offers a compromise between the 
time complexity needed to compute approximate answers and the quality of 
these answers. The second one proposes a trade-off between the space complex- 
ity of the compiled knowledge base and the number of possible answers that 
can be efficiently processed by the resulting data structure. Our aim was to
provide a unifying, logic-oriented framework which handles these two approaches
and enables us to specify anytime reasoners. We have focused on a sound
and complete multi-modal logic, named ARL, which generalizes and expands in
several directions previous methods concerning approximate deduction [2,5,17]
and anytime compilation [7,15,18]. Based on this logic, we have illustrated that
the framework integrates several major features: resource-bounded reasoning, 
improvability, dual reasoning and off-line processing. 

We believe that the results reported here are interesting and worthy of further
investigation. We outline some directions. A first extension concerns the
empirical analysis of the parameter S. In particular, it has been found in [21] that
random “3-SAT” knowledge bases can be classified in three categories, namely
under-constrained, over-constrained and critically-constrained, according to the
clauses-to-atoms ratio. An important issue here is to examine the relationship
between this ratio and the resource parameter S. This will give an estimation
of the number of computational resources (i.e. atoms) required to perform
deduction and compilation tasks in each category of knowledge bases. A second
extension is to study anytime reasoning in the setting of pseudo-first-order logic,
which has received a great deal of interest in database theory. These
representation languages are defined from a finite domain of discourse without function
symbols. However, although every first order knowledge base can be replaced by 
an equivalent propositional theory, the size of the theory may be exponentially 
larger than the initial base. Hence, formal extensions of our logic should be done 
in order to control such a source of complexity. A third possible extension is 
to consider anytime reasoning in a nonmonotonic setting. As an example, in a 
multi-agent system, a reasoner is often confronted with uncertain and incon- 
sistent information [9,10]. In such circumstances, it is necessary to incorporate 
conflict resolution methods and preference orderings which involve additional 
sources of complexity. Important issues such as anytime nonmonotonic reason- 
ing and anytime recompilation should play a key role in this setting. 



Appendix A. Proof of Soundness and Completeness

The following results give soundness and completeness for the axiom system. 

Theorem 14 (Soundness). For every sentence $\alpha$ of the logic ARL,

if $\vdash \alpha$ then $\models \alpha$.

Proof. It is easy to see that axioms (A1)-(A6) are sound, and that the inference
rule (R1) preserves validity. Let us examine the other axioms.

— Axiom (A7). The only nontrivial case is the soundness of $\Box_S (\alpha \vee \beta) \supset \Box_S \alpha \vee \Box_S \beta$.
Suppose that this sentence is not valid. Then there exists a
possible world w such that $w \models_1 \Box_S (\alpha \vee \beta)$ and $w \not\models_1 \Box_S \alpha \vee \Box_S \beta$. In the
first case, for every valuation v such that $v \sim_S w$, we have $v \models_1 \alpha \vee \beta$. In the
second case, there exist two valuations v', v'' such that $v' \sim_S w$, $v'' \sim_S w$,
$v' \not\models_1 \alpha$ and $v'' \not\models_1 \beta$. Let us define a new valuation u, such that $u(p) = w(p)$
for every $p \in S$, and $u(q) = \emptyset$ for every $q \notin S$. It is clear that w, v, v', v''
and u belong to the same equivalence class defined for $\sim_S$. Moreover, $u \subseteq v'$
and $u \subseteq v''$. By lemma 1, we obtain $u \not\models_1 \alpha$ and $u \not\models_1 \beta$. So $u \not\models_1 \alpha \vee \beta$ and
therefore $w \not\models_1 \Box_S (\alpha \vee \beta)$, hence contradiction.

— Axiom (A8). Let us examine the first sentence. Suppose that $\Box_{S \cup \{p\}} (p \vee \neg p)$
is not valid. Then there must exist a possible world w such that
$w \not\models_1 \Box_{S \cup \{p\}} (p \vee \neg p)$. In this case, there exists a valuation v such that $w \sim_{S \cup \{p\}} v$
and $v \not\models_1 p \vee \neg p$. However, since w is a possible world, we have $1 \in w(p)$ or
$0 \in w(p)$. Moreover, since $p \in (S \cup \{p\})$ it follows that $1 \in v(p)$ or $0 \in v(p)$.
So $v \models_1 p \vee \neg p$, hence contradiction. We now turn to the second sentence.
$\Diamond_{S - \{p\}} (p \wedge \neg p)$ is valid iff for every possible world w there exists a valuation
v such that $w \sim_{S - \{p\}} v$ and $v \models_1 p \wedge \neg p$. Let us define a new valuation
v' such that $v'(p) = \{0, 1\}$ and for every $q \in (S - \{p\})$, $v'(q) = w(q)$. It
is clear that $v' \models_1 p \wedge \neg p$. Moreover, for every possible world w, we have
$w \sim_{S - \{p\}} v'$. Therefore, $\Diamond_{S - \{p\}} (p \wedge \neg p)$ is valid.

— Axiom (A9). Suppose that $\Box_S \alpha \supset \alpha$ is not valid. Then there exists a possible
world w such that $w \models_1 \Box_S \alpha$ and $w \not\models_1 \alpha$. If $w \models_1 \Box_S \alpha$ then for every
valuation v such that $w \sim_S v$, we have $v \models_1 \alpha$. Since $\sim_S$ is reflexive, it
follows that $w \models_1 \alpha$, hence contradiction.

— Axiom (A10). Let us examine the first implication (if). Suppose that $\Box_S \alpha \supset \neg\Diamond_S \neg\alpha$
is not valid. Then there exists a possible world w such that $w \models_1 \Box_S \alpha$
and $w \not\models_1 \neg\Diamond_S \neg\alpha$. In the first case, for any valuation v such that
$v \sim_S w$, we have $v \models_1 \alpha$, while in the second case there exists a valuation
v' such that $v' \sim_S w$ and $v' \models_0 \alpha$. It follows that $v' \models_1 \alpha \wedge \neg\alpha$. Let us
define a new valuation v'' such that for every $p \in S$, $v''(p) = v'(p)$,
and for every $q \notin S$, if $v'(q) = \emptyset$ then $v''(q) = \{0, 1\}$ and if $v'(q) = \{0, 1\}$
then $v''(q) = \emptyset$. It is easy to show by induction on the structure of $\alpha$ that
if $v' \models_1 \alpha \wedge \neg\alpha$ then $v'' \not\models_1 \alpha \vee \neg\alpha$. Moreover, it is clear that $v' \sim_S v''$.
Therefore, it follows that $w \not\models_1 \Box_S \alpha$, hence contradiction. An analogous
strategy applies to the second implication (only if).



Theorem 15 (Completeness). For every sentence $\alpha$ of the logic ARL,

if $\models \alpha$ then $\vdash \alpha$.

Proof. We divide the proof into three steps. In the first step, we introduce the
notion of maximal consistent set and we examine several properties of these
structures. In the second step, we show that every sentence of the language
has two important normal forms, namely an extended conjunctive normal form
(ECNF) and an extended disjunctive normal form (EDNF) which are useful for
the satisfiability test. Finally, in the third step, we prove that every sentence in
a maximal consistent set is satisfiable. We conclude the proof by showing that
this result is sufficient to state completeness.

We start the first step by giving some definitions. A sentence $\alpha$ is consistent
if its negation is not a theorem (i.e. $\not\vdash \neg\alpha$). A finite set of sentences is consistent
exactly if the conjunction of its sentences is consistent, and an infinite set of
sentences is consistent exactly if all of its finite subsets are consistent. A sentence
or a set of sentences is said to be inconsistent exactly if it is not consistent.
Finally, a set A of sentences is called maximal consistent if it is consistent and
for any sentence $\alpha$ of ARL, if $\alpha \notin A$ then $A \cup \{\alpha\}$ is inconsistent.

Lemma 16. In the logic ARL, every consistent set of sentences can be
extended to a maximal consistent set of sentences. In addition, if A is a maximal
consistent set, then it satisfies the following properties:

for every sentence $\alpha$, exactly one of $\alpha$ and $\neg\alpha$ is in A, (1)

$\alpha \wedge \beta \in A$ iff $\alpha \in A$ and $\beta \in A$, (2)

$\alpha \vee \beta \in A$ iff $\alpha \in A$ or $\beta \in A$, (3)

if $\vdash \alpha$ then $\alpha \in A$, (4)

if $\alpha \in A$ and $\vdash \alpha \supset \beta$ then $\beta \in A$. (5)



Proof. Straightforward, by using standard techniques of propositional logic. 

We now turn to the second step. We first recall that a literal is an atom or
its negation, a clause is a finite disjunction of literals and a term is a finite
conjunction of literals. A declaration in conjunctive normal form (CNF) is a
conjunction of clauses and a declaration in disjunctive normal form (DNF) is a
disjunction of terms. A sentence $\alpha$ is said to be in extended conjunctive normal
form (ECNF) if, treating subformulas of the form $\Box_S \beta$ and $\Diamond_S \beta$ as atoms, $\alpha$ is
in CNF, and for each subformula $\Box_S \beta$ and $\Diamond_S \beta$ of $\alpha$, $\beta$ is in CNF. The notion
of extended disjunctive normal form (EDNF) is defined in a dual way.

Lemma 17. For every sentence $\alpha$ of the logic ARL,

$\vdash \alpha \equiv \alpha_{ECNF}$ and $\vdash \alpha \equiv \alpha_{EDNF}$.

Proof. By repeated applications of axioms (A2)-(A6) and rule (R1).

Now that we have proved the two preparatory lemmas, we turn to the third step.
We demonstrate that every sentence in a maximal consistent set is satisfiable in
the semantics of our logic. This property may be formalized as follows.

Lemma 18. If A is a maximal consistent set of sentences of ARL, then there
exists a possible world $w_A \in W_P$ such that:

$\forall \alpha \in \mathrm{ARL}$, $\alpha \in A$ iff $w_A \models_1 \alpha$.

Proof. Let $w_A$ be defined in the following way: for every atom $p \in P$, $w_A \models_1 p$
iff $p \in A$. We prove the lemma by induction on the structure of $\alpha$. If $\alpha$ is an atom
p, this is immediate from the definition of $w_A$. The cases where $\alpha$ is a negation,
a conjunction or a disjunction follow easily from parts (1), (2) and (3) of lemma
16. In the proof below, we concentrate on the case where $\alpha$ is of the form $\Box_S \beta$.
The final case where $\alpha$ is of the form $\Diamond_S \beta$ directly follows from axiom (A10).

— (if direction). Suppose that $\Box_S \beta \in A$ and that $w_A \not\models_1 \Box_S \beta$. From the first
assumption and by lemma 17, we have $\Box_S \beta_{ECNF} \in A$. So, by axiom (A7) and
part (2) of lemma 16, for every clause $\gamma$ of $\beta_{ECNF}$ we must have $\Box_S \gamma \in A$.
Moreover, by axiom (A7) and part (3) of lemma 16, there exists at least one
literal l of $\gamma$ such that $\Box_S l \in A$. From the second assumption, we obtain
$w_A \not\models_1 \Box_S \beta_{ECNF}$. Therefore, there must exist a clause $\gamma'$ of $\beta_{ECNF}$, such
that for every $l' \in \gamma'$, we have $w_A \not\models_1 \Box_S l'$. If $l' \in S$ then $w_A \not\models_1 l'$, and by
the induction hypothesis we have $l' \notin A$. By axiom (A9) we obtain $\Box_S l' \notin A$,
hence contradiction. If $l' \notin S$, by axiom (A8) we obtain $\Diamond_S (l' \wedge \neg l') \in A$,
and by axiom (A10) we have $\neg\Box_S \neg(l' \wedge \neg l') \in A$. By application of axioms
(A2), (A3) and (A6), it follows that $\neg\Box_S (l' \vee \neg l') \in A$. From part (1) of
lemma 16 we first obtain $\Box_S (l' \vee \neg l') \notin A$ and by axiom (A7) it follows that
$\Box_S l' \vee \Box_S \neg l' \notin A$. Finally, from part (3) of lemma 16, we obtain $\Box_S l' \notin A$,
hence contradiction.

— (only if direction). Suppose that $w_A \models_1 \Box_S \beta$ and that $\Box_S \beta \notin A$. From
the first assumption, we obtain $w_A \models_1 \Box_S \beta_{ECNF}$. So, for every clause $\gamma$ of
$\beta_{ECNF}$ there exists a literal l such that its atom is an element of S and
$w_A \models_1 l$. However, by axiom (A8) we have $\Box_S (l \vee \neg l) \in A$. By axiom (A7),
it follows that $\Box_S l \vee \Box_S \neg l \in A$, and from part (3) of lemma 16 we have
$\Box_S l \in A$ or $\Box_S \neg l \in A$. Since $w_A \models_1 l$, by the induction hypothesis we
have $l \in A$. By axiom (A9) we must obtain $\Box_S \neg l \notin A$, because otherwise
we would have $\neg l \in A$. Therefore we have $\Box_S l \in A$, and by axiom (A7) it
follows that $\Box_S \gamma \in A$. A similar method applies to each clause of $\beta_{ECNF}$.
Therefore, by axiom (A7), it follows that $\Box_S \beta_{ECNF} \in A$. Finally, by lemma
17 we obtain $\Box_S \beta \in A$, hence contradiction.

Now, we have all the results in hand for concluding the proof. By lemma 
16, we know that if a sentence is consistent, then it is a member of a maximal 
consistent set. But in lemma 18, we have shown that if a sentence belongs to such 
a set, then it is satisfiable. So, we have proved that every consistent sentence is 
satisfiable. This is sufficient to conclude that every valid sentence is provable. 



Abstract. A powerful syntactic theory as well as expressive modal logics
have to deal with self-referentiality. Self-referentiality and paradoxes
seem to be close neighbours and depending on the logical system, they 
have devastating consequences, since they introduce contradictions and 
trivialise the logical system. There are a large number of different
attempts to tackle these problems. Some of them are compared in this
paper; furthermore, a simple approach based on a three-valued logic is
advocated. In this approach paradoxes may occur and are treated formally.
However, it is necessary to be very careful, otherwise a system built on 
such an attempt trivialises as well. In order to be able to formally deal 
with such a system, the reason for self-referential paradoxes is studied in 
more detail and a semantical condition on the connectives is given such 
that paradoxes are excluded. 



Keywords: Knowledge representation, self-referentiality, paradox, Kleene logic

Ich habe manche Zeit damit verloren,
Denn ein vollkommner Widerspruch
Bleibt gleich geheimnisvoll für Kluge wie für Toren. ...
Gewöhnlich glaubt der Mensch, wenn er nur Worte hört,
Es müsse sich dabei doch auch was denken lassen.

Johann Wolfgang von Goethe, Faust I



1 Introduction 

The symbolic representation of knowledge and belief plays a crucial role in arti- 
ficial intelligence, in particular in multi-agent systems. For their representation 
two principally different formal systems, modal logic and meta-systems have 
been developed. Both of them come with a detailed ramification of realisations 
depending on what is actually adequate and what is not. 

The predominant formalism for representing knowledge and belief in agent
systems is based on modal logics (there is a vast amount of literature on modal
logic; I just point to Fitting’s general introduction, where further pointers can be
found, see [4, p. 365-448]). However, meta-systems have some crucial advantages
over modal logic, although some problems as well. As Davis points out in [2, p. 77],
the difference between the two approaches is pretty much the same as the
difference between direct quotation (“John said, ‘I am hungry.’”), corresponding to
meta-systems, and indirect quotation (“John said that he was hungry.”),
corresponding to modal logic. Davis stresses that direct speech is more powerful, since
in particular non-syntactic expressions can be expressed directly (“John said
‘Arglebargle glumph’”), but not indirectly. A similar property holds for modal logic
versus meta-logic. In modal logic an agent knows/believes all tautologies as
well as the deductive closure of all its knowledge/beliefs. In meta-systems,
however, it is easily possible to do away with this unrealistic property. On the other
hand, meta-systems allow paradoxical sentences such as the liar sentence (“This
sentence is false”) to be represented. This seems to be a crucial disadvantage.
However, powerful modal logics are faced with the same problem [15].

In this paper, we shall see in more detail, why paradoxes occur in meta- 
systems and what different ways out of the problems exist. Essentially I shall 
advocate a system, in which paradoxes are taken seriously and formulae like the 
liar sentence are admitted. They are neither true nor false, but paradoxical. In 
order to do so, a third truth value is added. As we shall see, just adding such a 
truth value does not solve the problem, but the system has to fulfill an additional 
constraint. 



2 Meta-Logic and Syntactic Theory 

A reflective meta-system is a system in which it is possible to speak about
the system itself. Frege’s original system (see [18]) was of that kind. Russell’s
discovery at the beginning of the 20th century that self-referentiality, in the form
of the “set of all sets that do not contain themselves”, leads to paradoxes
jeopardised the whole enterprise of formalising logic. After a few years Russell
himself came up with a solution - type theory - that bypasses the unwanted
phenomena by syntactically forbidding them [16]. This approach is generally
accepted in mathematics (and adequate for mathematics). Similar paradoxes
occur in syntactic theory as well (see Grelling and Nelson [8]). If we define,
for instance, a predicate long, then there is a difference between stating long(John)
and long(“John”): the first expression refers to a person called John, the second
to a string consisting of four letters. We can now define a new predicate
heterogeneous as ∀x. heterogeneous(“x”) ↔ ¬x(“x”). For instance, if we define
long on strings such that a string has to have at least eight characters in order to be
long, we have heterogeneous(“long”), since ¬long(“long”) holds. The paradox
occurs when we check whether heterogeneous is heterogeneous or not. By definition
we get: heterogeneous(“heterogeneous”) ↔ ¬heterogeneous(“heterogeneous”).
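The construction can be tried out in a few lines of Python. This is only an illustrative sketch: the registry PREDICATES, which maps a predicate name (a string) to the predicate it stands for, and the helper has_truth_value are assumptions made for this example and are not part of the formal language discussed here.

```python
def long(s):
    # A string is long if it has at least eight characters.
    return len(s) >= 8

# Hypothetical registry relating quoted names to the predicates they denote.
PREDICATES = {"long": long}

def heterogeneous(name):
    # heterogeneous("x") holds iff x("x") does not hold.
    return not PREDICATES[name](name)

PREDICATES["heterogeneous"] = heterogeneous

def has_truth_value(name):
    # Try to evaluate heterogeneous(name); the self-application never settles.
    try:
        heterogeneous(name)
        return True
    except RecursionError:
        return False

print(heterogeneous("long"))             # "long" has only four characters
print(has_truth_value("heterogeneous"))  # evaluation does not terminate
```

In this sketch the paradox shows up as a non-terminating evaluation: asking whether heterogeneous is heterogeneous unfolds into its own negation again and again.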

Of course, we can argue whether the definition of heterogeneous really forms
a proper definition, since it has the form ∀x. heterogeneous(“x”) ↔ ... rather
than ∀x. heterogeneous(x) ↔ .... Excluding this extended form of definitions
(and excluding expressions of the form x(x) by some kind of types) does away
with self-referentiality and the paradoxes disappear. It is still possible to define
predicates such as long (e.g., as ∀x. long(x) ↔ length(x) ≥ 8), but not to define
problematic predicates like heterogeneous.

However, while the predicate heterogeneous is not crucial for a meta-system,
the True predicate is, and this is excluded as well. Tarski [17] gave a famous
definition of truth:

True(“A”) ↔ A

as axiom schema for arbitrary formulae A. While this definition seems to be
obvious and to form a minimal requirement to relate strings and the objects
they stand for, it already is a source of problems: The famous liar sentence “This
sentence is false” can be expressed in syntactic theory as L := ¬True(“L”). Note
that there are no quotes around the definiendum, hence unlike the definition of
heterogeneous, the definition of L is much more of the form we would accept as
a proper definition. With Tarski’s definition of truth, this definition unfolds to
L ↔ ¬True(“L”) ↔ ¬L, that is, L ↔ ¬L. This expression is inconsistent with
interpreting L as true as well as with interpreting it as false. There are different ways
out of this fundamental problem. Three of them are discussed in more detail in
Section 4. First we shall take a closer look at the formalisation of the logic.



3 The Language 



The language we want to investigate is a standard first-order language plus
strings as terms. We assume variables (usually denoted by x, y, and z),
connectives ¬, ∧, ∨, and the quantifiers ∀ and ∃ as well as auxiliary symbols ), (
and , as usual, and a signature that consists of a set of predicate symbols (denoted
by P, Q, and R, e.g.) and a set of function symbols (denoted by f, g, and h,
e.g.), all with arities. Nullary function symbols are called constants. Terms are
recursively defined as variables, constants or the application of n-ary function
symbols to n terms. Formulae are either atomic formulae built by the application
of an n-ary predicate symbol to n terms or formulae composed by connectives
and quantifiers.

In addition to this standard setting, we use strings in the language. Strings
are tuples of characters. Following the description in [2] we prefix characters by a
colon, e.g., :C is the character C, and the string “Cat” is syntactic sugar for tuple(:C,
:a, :t). It is possible to define syntactic predicate symbols such as is_term and
is_formula and syntactic function symbols such as conc for concatenation. Details
of such a construction can be found in [5, Chapter 10]. In addition there are
semantic predicates such as True. We shall discuss this predicate symbol in detail
in the sequel. Self-referentiality occurs in this language, since it is possible to
formally define the liar sentence as L := ¬True(“L”). Strings such as “L” are
considered as ordinary ground terms in the language. They do not cause any problem
as long as we do not look into them and relate them to the objects they stand
for.



4 Ways Out?

In this section we distinguish three different ways of dealing with self-referential
paradoxes: firstly, forbid self-referentiality; secondly, abandon or weaken the
definition of truth; and thirdly, change the underlying logical system.

First approach 

This approach corresponds pretty much to Russell’s solution of the problem,
namely to syntactically exclude any self-referentiality, be it paradoxical or
not. Russell introduced types such that in any expression of the form P(Q), P
has to have a higher type than Q. In particular, P can’t express anything about
itself or about anything that can make statements about P. This approach has
been adapted by Tarski to meta-systems. It is formalised and implemented in
the multi-level meta-systems FOL [19] and Getfol [7]¹. These systems consist
of different instances of first-order logic, an object system (standard classical
first-order logic) and a meta-system in which the terms stand for syntactical
expressions (like terms and formulae) of the object system². The relation between
the meta-level and the object level is established by so-called bridge rules. They
state that whenever True(“A”) has been derived in the meta-system, it is possible
to derive A on the object level by a rule called reflection-down, and vice versa,
when A or ¬A is derived on the object level, it is possible to assume True(“A”)
or ¬True(“A”) on the meta-level.

Let us exemplify this by showing how a simple meta-theorem, namely
∀A. True(A) → ¬True(conc(“¬”, A))
can be proved in such a system.

conc stands for concatenation of strings. We use O to indicate the object level
and M for the meta-level.



Proof:  M 1   1 ⊢ True(“A₀”)                          (Ass)
        O 2   1 ⊢ A₀                                  (R_dn 1)
        O 3   1 ⊢ ¬¬A₀                                (¬¬ 2)
        M 4   1 ⊢ ¬True(“¬A₀”)                        (¬R_up 3)
        M 5     ⊢ True(“A₀”) → ¬True(“¬A₀”)           (→I 1,4)
        M 6     ⊢ ∀A. True(A) → ¬True(conc(“¬”, A))   (∀I 5)

That is, from the assumption True(“A₀”) on the meta-level, it is possible to
reflect down to A₀ on the object level. ¬¬A₀ is derived on the object level
and reflected up to the meta-level formula ¬True(“¬A₀”). The rest is standard
reasoning on the meta-level, that is, implication introduction (from lines 1 and
4) and forall introduction (from line 5).

¹ An alternative to totally forbidding self-reference is restricting it, e.g., restricting
expressions like {x | x ∉ x} to {x ∈ A | x ∉ x}. This approach is discussed in detail
in [1].

² Actually the systems are extended to allow different meta-levels, in particular a
meta-meta-level and so on.

Self-referentiality is not possible: the meta-system speaks about the object
system, but never about itself. Tarski’s definition of truth is built into the reflection
rules (R_up, R_dn, and ¬R_up). The liar sentence as such is not representable in
that system; useful sentences are, however, excluded as well. It is not possible
to extend the system in a way that two agents can make utterances about each
other. For instance, A says “Everything B says is right.” and B says “Everything
A says is right.” is not expressible. If you allow extensions of this kind, you
introduce the paradoxes again in sentences such as: A says “Everything B says
is wrong.” and B says “Everything A says is right.”

In order to show that such situations are not artificial Kripke [13] gave an 
interesting example that did actually occur at the time of the Watergate affair: 

— Nixon says “Everything Jones says about Watergate is true.” 

— Jones says “Most of Nixon’s assertions about Watergate are false.” 

You get a paradox when you assume Nixon made 2n + 1 assertions about
Watergate, n of which are definitely true and n definitely false, plus the one
above.
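The counting argument can be checked by brute force. The following sketch rests on the simplifying assumption, made only for this illustration, that the quoted sentence is the only assertion Jones made about Watergate; the function name consistent_assignments is likewise hypothetical.

```python
from itertools import product

def consistent_assignments(n):
    """Classical truth values (nixon, jones) compatible with both statements."""
    result = []
    for nixon, jones in product((False, True), repeat=2):
        # Nixon's 2n + 1 assertions: n true, n false, plus the one about Jones.
        nixon_true = n + (1 if nixon else 0)
        nixon_false = n + (0 if nixon else 1)
        # Nixon: "Everything Jones says about Watergate is true."
        # Assumption: Jones made only the single assertion evaluated below.
        nixon_should_be = jones
        # Jones: "Most of Nixon's assertions about Watergate are false."
        jones_should_be = nixon_false > nixon_true
        if nixon == nixon_should_be and jones == jones_should_be:
            result.append((nixon, jones))
    return result

for n in range(4):
    print(n, consistent_assignments(n))
```

Under these assumptions no classical assignment survives for any n, which is exactly the paradox: Nixon's statement is true if and only if Jones's is, and Jones's statement is true if and only if Nixon's is false.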

Second approach 

In the second approach Tarski’s definition of truth is changed. If you give it
up completely, the liar sentence does not produce any problem any more. However,
this solution is far too radical, since in such a case meta-level and object level
would co-exist without any relation between them. For that reason Perlis defined a
weakened form of the definition of truth [14], True(“A”) ↔ (A)*, where the
*-operator replaces each occurrence of the form ¬True(“...”) in A
by True(“¬(...)”). Using Kripke’s fixpoint semantics he can build a system in
which the liar sentence, L := ¬True(“L”), is not paradoxical. However, since L is
the liar sentence, True(“L”) can’t hold. Hence in the system, L and ¬True(“L”)
have to hold at the same time. In particular L is true; this certainly interprets
the liar sentence in a non-intuitive way: intuitively, the sentence can be neither
true nor false. In Perlis’ formal system it is true, however.

Compared to the first approach, a general formula like ∀A. True(A) →
¬True(conc(“¬”, A)) cannot so easily be proved anymore, but requires an
inductive argument on the construction of the formulae A. Assume, for instance,
A := ¬True(“P”) with P a nullary predicate constant. The corresponding
instance of the formula above is:

True(“¬True(“P”)”) → ¬True(“¬¬True(“P”)”). With the new definition of truth
and elimination of double negation, this formula can be rewritten to:
True(“¬P”) → ¬True(“P”), a formula which reduces to ¬P → ¬P in turn.

As in the first approach, the liar sentence does not cause any formal problem
anymore in this approach, although the solution is not intuitive. General
problems about mutually related statements about each other’s beliefs (A says
“Everything B says is wrong.” and B says “Everything A says is right.”) persist,
however. By the way, as Perlis showed [15], this problem is not specific to
meta-systems, but occurs in modal logics as well (unless there are serious restrictions
on the expressive power of the modal logic).

Third approach

Kripke [13] and others attacked the problem by changing the semantic
construction of the truth values of formulae. They go beyond two-valued logic in
that they assume a third truth value, which stands for the paradoxical
expressions. There are different ways, either to assume truth value gaps [3] or
a third truth value, which states undefinedness explicitly [12, p. 332-340]. In
both systems, Tarski’s original definition of truth can be incorporated, that is,
True(“A”) = A, and the liar sentence³, L := ¬True(“L”), can be dealt with as
well.

In the first system with truth value gaps, no truth value is assigned to the
liar sentence L. In the second, L is evaluated to the truth value undef; in
particular, True(“L”) is undefined as well. The meta-theorem above, ∀A. True(A) →
¬True(conc(“¬”, A)), does not hold anymore, since with the instance “L” we get
True(“L”) → ¬True(“¬L”). This expression reduces to L → ¬¬L and this in
turn to L → L. The last expression does not have any truth value in the first
system and the truth value undefined in the second.

The question arises how to determine the truth values in such expressions.
To this end Kripke proposes a fixpoint iteration, a process in which in each
iteration truth values are assigned to more and more formulae. In the following I
shall present an approach that is based on a three-valued logic as well, but does
away with the complicated fixpoint construction and is based on a standard
calculus for three-valued Kleene logic as developed by Kohlhase and myself, for
instance, see [10].

5 Kleene Logic — a Simple System for Dealing with 
Paradoxes 

Let us now assume the following two unary connectives ¬ (not) and D (defined)
in the system with the following truth tables

         ¬       D
false    true    true
undef    undef   false
true     false   true

and the binary connectives ∨ (or) and = (strongly equivalent)

∨        false   undef   true          =        false   undef   true
false    false   undef   true          false    true    false   false
undef    undef   undef   true          undef    false   true    false
true     true    true    true          true     false   false   true



³ We use = to denote strong equivalence, which allows for full substitutivity. That is,
A = B is true if and only if the two truth values for A and B agree; it is false otherwise,
cp. the truth table on this page.

Other connectives such as ∧, →, and ↔ can be defined in terms of ∨ and ¬ just as in
classical two-valued logic, e.g., A ∧ B := ¬(¬A ∨ ¬B), and the corresponding
truth tables can be calculated from these definitions.
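As an illustration, the truth tables above and the derived connectives can be written down directly in Python. The encoding of truth values as the strings "false", "undef" and "true" and all function names are assumptions of this sketch.

```python
VALUES = ("false", "undef", "true")

def neg(a):
    """Negation: swaps true and false, leaves undef unchanged."""
    return {"false": "true", "undef": "undef", "true": "false"}[a]

def defined(a):
    """The D connective: true exactly when the argument is not undef."""
    return "false" if a == "undef" else "true"

def disj(a, b):
    """Disjunction as in the truth table for V."""
    if "true" in (a, b):
        return "true"
    if "undef" in (a, b):
        return "undef"
    return "false"

def equiv(a, b):
    """Strong equivalence: true iff both truth values agree, false otherwise."""
    return "true" if a == b else "false"

def conj(a, b):
    """A and B, defined as not(not A or not B)."""
    return neg(disj(neg(a), neg(b)))

def implies(a, b):
    """A implies B, defined as (not A) or B."""
    return disj(neg(a), b)

# Print the calculated truth tables of the two derived connectives row by row.
for a in VALUES:
    print(a, [conj(a, b) for b in VALUES], [implies(a, b) for b in VALUES])
```

Calculating the tables this way shows, for instance, that a false conjunct makes a conjunction false even if the other conjunct is undefined, while an implication between two undefined formulae is itself undefined.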

When we look at the sentence “This sentence is undefined or false”, we can
define it in this system as F := ¬D True(“F”) ∨ ¬True(“F”). With the definition
of truth we can simplify the expression to:

F = ¬DF ∨ ¬F. If we look at the possible truth values for F, that is, false,
undef, and true, we get the following evaluation, which shows that F cannot be
consistently assigned any of the three truth values that are available:

F        DF      ¬DF     ¬F      ¬DF ∨ ¬F    F = (¬DF ∨ ¬F)
false    true    false   true    true        false
undef    false   true    undef   true        false
true     true    false   false   false       false


Along the same lines, the expression

G := (True(“G”) = ¬True(“G”)) = True(“G”)
causes the same problem.
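Both observations can be verified mechanically by searching for truth values that a defining expression maps to themselves. The sketch below uses a hypothetical string encoding of the truth values; f_body and g_body stand for the right-hand sides of the definitions of F and G after applying the definition of truth.

```python
VALUES = ("false", "undef", "true")
NEG = {"false": "true", "undef": "undef", "true": "false"}

def neg(a):
    return NEG[a]

def defined(a):
    return "false" if a == "undef" else "true"

def disj(a, b):
    if "true" in (a, b):
        return "true"
    if "undef" in (a, b):
        return "undef"
    return "false"

def equiv(a, b):
    return "true" if a == b else "false"

def f_body(x):
    # Right-hand side of F = (not D F) or (not F).
    return disj(neg(defined(x)), neg(x))

def g_body(x):
    # Right-hand side of G = ((G = not G) = G).
    return equiv(equiv(x, neg(x)), x)

def consistent_values(body):
    """Truth values v with body(v) == v, i.e. consistent values for the sentence."""
    return [v for v in VALUES if body(v) == v]

print(consistent_values(f_body))
print(consistent_values(g_body))
print(consistent_values(neg))  # the plain liar sentence, for comparison
```

For F and G the search comes back empty, whereas the plain liar sentence L := ¬True(“L”) still has undef as a consistent value.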

That means, the malign paradoxes we tried to ban by the third truth value
are back. As Davis says [2, p. 85]: “Having a fundamental flaw like this in a logic is
worrisome, like carrying a loaded grenade; you never know when it might go off.”
On the other hand, to exclude self-referentiality altogether seems to drastically
limit the expressive power of a system. As Kripke states [13, p. 698, footnote
13]: “... some writers still seem to think that some kind of general ban on
self-reference is helpful in treating semantic paradoxes. In the case of self-referential
sentences, such a position seems to me to be hopeless.”

Hence there arises the question: What is the very reason for the presence of 
paradoxes in the first place and how can we get rid of them? The next section 
is devoted to this question. 



How to get and how to avoid self-referential paradoxes 

The source of any paradox of self-referentiality lies in the fact that it is
possible to define a formula to which no truth value can be assigned. This can
only be done if it is possible to define a function in the connectives that is
fixpoint free on the truth values. In a two-valued system such a function is easily
defined, it is just negation, that is, as soon as negation is part of a self-referential
system, paradoxes are not far away. In our setting above λx. ¬Dx ∨ ¬x and
λx. (x = ¬x) = x are of that type as well. If we apply one of these functions to
any of the three truth values false, undef, or true, the result will never be the same
as the input of the function. From this the paradoxes are easily constructed.

However, if we restrict the language to a system in which each function that
can be built from the connectives has at least one fixpoint (conveniently, undef
would play that role), it is not possible to construct self-referential paradoxes.
As the two counterexamples above show, in a three-valued setting, no language
that contains the connectives ¬, ∨, and D and no language with unrestricted
use of ¬ and = guarantees the existence of fixpoints. However, since ¬
has undef as fixpoint, λx. x ∨ x has fixpoint undef as well, and since the fixpoint
property is conserved under composition, we get:

Theorem 1. Any function built from the connectives ¬ and ∨ has the fixpoint
undef.

Since the connectives ∧, →, and ↔ can be defined in terms of ¬ and ∨, the property
of course still holds when we add these connectives to the system.

Corollary 2. Self-referential paradoxes are not constructible in the three-valued
system described above that contains just the connectives ¬, ∨, ∧, →, and ↔.
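For functions of a single argument, Theorem 1 can also be confirmed by exhaustive enumeration. The following sketch represents each definable function by its value table on (false, undef, true) and closes the set of tables under the connectives; the representation and the names definable_functions and fixpoint_free are assumptions of this illustration, not part of the formal development.

```python
from itertools import product

VALUES = ("false", "undef", "true")
NEG = {"false": "true", "undef": "undef", "true": "false"}

def neg(a):
    return NEG[a]

def defined(a):
    return "false" if a == "undef" else "true"

def disj(a, b):
    if "true" in (a, b):
        return "true"
    if "undef" in (a, b):
        return "undef"
    return "false"

def definable_functions(unary_ops):
    """Value tables of all functions of x built from x, disjunction and unary_ops."""
    tables = {VALUES}  # start with the table of the identity function
    while True:
        new = {tuple(op(v) for v in t) for op in unary_ops for t in tables}
        new |= {tuple(disj(a, b) for a, b in zip(s, t))
                for s, t in product(tables, repeat=2)}
        if new <= tables:  # nothing new: the closure is complete
            return tables
        tables |= new

def fixpoint_free(table):
    """True if the function maps no truth value to itself."""
    return all(out != arg for arg, out in zip(VALUES, table))

restricted = definable_functions([neg])      # connectives: not, or
full = definable_functions([neg, defined])   # connectives: not, or, D

print(all(t[1] == "undef" for t in restricted))
print(any(fixpoint_free(t) for t in full))
```

In the restricted language every table sends undef to undef, while adding D makes a fixpoint-free function definable, in line with the counterexample λx. ¬Dx ∨ ¬x.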

Note that even in the case of complicated expressions in which different
agents make assertions with mutual relations, the fixpoint property allows us to
consider everything as undefined. Take, for example, the situation in which agent
A says “What B says is wrong.” and B says “What A says is right.” This can be
formalised as:

A := ¬True(“B”) and B := True(“A”). With the definition of truth, A reduces
to ¬B and B to A; the first expression states that A holds if and only if B does
not, while the second states that A holds if and only if B does too. Both cannot
consistently hold with the truth values true and false. If, however, we assign
undef to A and B, the problem is gone, since undef is a fixpoint of ¬.
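A short enumeration makes this concrete. The sketch below, again with a hypothetical string encoding of the truth values, lists every pair of truth values for A and B that respects both definitions after applying the definition of truth.

```python
from itertools import product

VALUES = ("false", "undef", "true")
NEG = {"false": "true", "undef": "undef", "true": "false"}

def consistent_pairs():
    """All (a, b) with a = not b (A's statement) and b = a (B's statement)."""
    return [(a, b) for a, b in product(VALUES, repeat=2)
            if a == NEG[b] and b == a]

print(consistent_pairs())
```

The only pair that satisfies both definitions assigns undef to A and to B.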

Note furthermore that the system with truth value gaps and the three-valued
system differ. In both systems a true expression like ¬long(“long”) evaluates to
the truth value true. However, if we add a paradoxical disjunct like the liar
sentence, in the first system the truth value gap is contagious and the whole
expression ¬long(“long”) ∨ L does not get a truth value. In the second approach
the expression is evaluated to true ∨ undef, that is, to true.
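The difference between the two readings of disjunction can be sketched as follows. Modelling a truth value gap by Python's None and the names disj_gap and disj_three_valued are assumptions of this illustration; the gap version simply follows the description above, in which a gap in one disjunct leaves the whole disjunction without a value.

```python
def disj_gap(a, b):
    """Disjunction with truth value gaps: None marks a gap and is contagious."""
    if a is None or b is None:
        return None
    return a or b

def disj_three_valued(a, b):
    """Disjunction with an explicit third truth value undef."""
    if "true" in (a, b):
        return "true"
    if "undef" in (a, b):
        return "undef"
    return "false"

# not long("long") is true; the liar sentence L has a gap or the value undef.
print(disj_gap(True, None))
print(disj_three_valued("true", "undef"))
```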

In order to be able to introduce definitions (like the definition of the liar
sentence, but of course of useful sentences as well), the = connective cannot be
abandoned altogether, but may occur in definitions, that is, in the form A = ...
or ∀x. A(x) = ..., where A is a predicate constant.

Not all self-referential sentences are paradoxical. For instance, “This sentence
is true” is self-referential; formally, if we call the sentence T, it can be expressed
as T := True(“T”). All three truth values, true, false, and undefined,
can consistently be assigned to this sentence.

Note that due to the restriction on the connectives, the system does not
contain many tautologies (just the formulae True(“A”) = A are tautological).
This shouldn’t be seen as a disadvantage, since under normal circumstances we
do not want to communicate tautologies, but substantial facts, and then reason
about their consequences. In order to do so we enlarge the set of axioms by
further assumptions and use these for deriving further formulae.

6 Semantics and Calculus

The question is how we can restore a non-trivial system when there are no
proper tautologies in it. This is achieved by considering sequents, that is, pairs
consisting of a set of formulae Φ and a formula A (written Φ ⊢ A), in which all
formulae in Φ are assumed to be defined. The semantic equivalent is the model
relation Φ ⊨ A that stands for: in all models of Φ, that is, all interpretations
that evaluate every formula in Φ to true, A holds as well, that is, A is evaluated
to true as well (for details of the semantics see [10]). A proof theory that is built
on the refutation principle makes use of the fact that all formulae in Φ must be
true and refutes the assumptions that A might be false or undefined. Note the
asymmetry between Φ and A in this approach: formulae in Φ can’t be undefined,
while A might be.⁴
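For the propositional fragment, the model relation can be checked by enumerating all three-valued interpretations. The sketch below is an illustration only: formulae are represented as Python functions from an interpretation (a dictionary from atom names to truth values) to a truth value, and the name entails is hypothetical.

```python
from itertools import product

VALUES = ("false", "undef", "true")
NEG = {"false": "true", "undef": "undef", "true": "false"}

def neg(a):
    return NEG[a]

def disj(a, b):
    if "true" in (a, b):
        return "true"
    if "undef" in (a, b):
        return "undef"
    return "false"

def entails(atoms, premises, conclusion):
    """Every interpretation that makes all premises true makes the conclusion true."""
    for values in product(VALUES, repeat=len(atoms)):
        interpretation = dict(zip(atoms, values))
        if (all(p(interpretation) == "true" for p in premises)
                and conclusion(interpretation) != "true"):
            return False
    return True

a_or_b = lambda i: disj(i["A"], i["B"])
not_a = lambda i: neg(i["A"])
b = lambda i: i["B"]
excluded_middle = lambda i: disj(i["A"], neg(i["A"]))

print(entails(["A", "B"], [a_or_b, not_a], b))      # a substantial consequence
print(entails(["A"], [], excluded_middle))          # not a tautology: A may be undef
print(entails(["A"], [lambda i: i["A"]], excluded_middle))
```

The asymmetry is visible here: premises are required to be true, whereas the conclusion is allowed to be false or undefined in interpretations that are not models of the premises.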

In the following I give a short account of the proof theory of the system
described above. In order to exemplify the approach a natural deduction system
is introduced. It is based on restricted three-valued Kleene logic and has the
advantage that it does not need fixpoint iterations à la Kripke for
self-referential expressions.

Since our setting is three-valued, the law of the excluded middle does not 
hold anymore. In particular paradoxical sentences are neither true nor false. 

The rules roughly correspond to Gentzen’s classical rules of natural
deduction [6]. The main difference is that the classical introduction rule of negation,
the ¬I-rule

 [A]
  ⋮
  ⊥
 ----- ¬I
  ¬A

does not hold in the three-valued system. The liar sentence is a counter-example
to this rule. In order to conclude from a contradiction under assumption A to the
negation, i.e. ¬A, it is necessary to know that A is not paradoxical. This can be
ensured by showing that A ∨ ¬A is true. The ¬I-rule is adapted correspondingly.
Remember the restricted use of =. We use this connective for definitions only.
Hence we have an elimination rule (“Subst”), but no introduction rule (= is to
be read commutatively). By C[A] we denote a formula with subformula A; C[B]
is the formula in which A is replaced by B.

A natural deduction calculus consists of the axiom schema: 

Assumption: A ⊢ A

and the rules: 



⁴ In the full logic with the D connective, this is reflected in the changed form of the
deduction theorem: Φ ∪ {A} ⊨ B iff Φ ⊨ A ∧ DA → B. The deduction theorem is
not expressible in our restricted system, since we do not have the D connective.

   ⊥                              A = B    C[A]
 ----- Ex falso quodlibet         -------------- Subst
   A                                   C[B]

             [A]
              ⋮
 A ∨ ¬A       ⊥                   A    ¬A
 ---------------- ¬I              --------- ¬E
       ¬A                             ⊥

 A    B                A ∧ B                A ∧ B
 ------- ∧I            ------- ∧EL          ------- ∧ER
  A ∧ B                   A                    B

    A                     B
 ------- ∨IL           ------- ∨IR
  A ∨ B                 A ∨ B

 A{x/a}                 ∀x A
 -------- ∀I           -------- ∀E
  ∀x A                  A{x/a}

∀I goes with the usual variable condition. In addition all quantifier rules hold
only for defined variables and terms, that is, not for terms of the kind 1/0. I
don’t give a formal treatment here. A detailed description of a tableau calculus
for three-valued Kleene logic with soundness and completeness proofs can be
found in [10], a resolution calculus in [9].

Since we use a restricted language without a D connective it is possible to
make use of the strategy for reusing two-valued theorem proving methods in
dealing with Kleene logic as found in [11]. That means that any efficient standard
two-valued theorem prover can be turned into an efficient theorem prover for the
approach above just by adding a simple restriction strategy.

This gives a calculus for first-order Kleene logic in general. In order to handle
quoted expressions, we have to add axiom schemata that describe the semantics
of the meta-predicates defined on them. In the case of the truth predicate True,
which we have looked at more closely, this means we add the following axiom schema
to the axioms (in an efficient implementation we would use a corresponding
simplification rule of course).

Truth: ⊢ True(“A”) = A



Example “Liar Sentence”:

Let us assume a knowledge base consisting of the liar sentence and its definition
L = ¬True(“L”), that is, formally, the knowledge base is

Γ := {L, L = ¬True(“L”)}

By axiom Truth, we know True(“L”) = L, hence by Subst we get L = ¬L.
We can use this formula to derive by Subst from L the formula ¬L. From these
two we get ⊥ by ¬E. That is, a knowledge base assuming the liar sentence is
inconsistent. Note that it is not paradoxical, just as a classical knowledge base
that contains A and ¬A is inconsistent but not paradoxical.

If the knowledge base just contains the definition of the liar sentence, but
does not assert that it holds, that is,

Γ := {L = ¬True(“L”)}

we can derive by the definition of truth and Subst, L = ¬L. If there is nothing
else in the knowledge base, nothing else can be derived from this formula. In
case there are other formulae containing L, we are allowed to replace L by ¬L.
The formula L = ¬L is true, since L is undef and undef = undef is true. That
is, just defining the liar sentence does not cause any problem at all anymore.
By our remarks above, we get that in general it is not possible to introduce
contradictions by definitions.

Example “Self-referential Truth Sentence”: 

Let us now assume a knowledge base

Γ := {T, T = True(“T”)}.

This knowledge base is consistent (just as the knowledge base Γ′ := {¬T, T =
True(“T”)}). Nothing can be derived if someone protests his sincerity. This again
corresponds to classical logic, where a knowledge base that consists of an atom
A is consistent as well as a knowledge base that just consists of ¬A.

An Example with Different Agents: 

Assume now a simple knowledge base, in which agent A says “B says the truth”,
B says “C says the truth” and C says “Joe is the murderer”; let us furthermore
assume that A speaks the truth. Formally we have the knowledge base

Γ := {True(“A”), A = True(“B”), B = True(“C”), C = True(“murderer(Joe)”)}.

Expanding the definition of truth, we get A; again by the definition of truth and
Subst, we get B, in the next step C and finally murderer(Joe).

Let us now assume a different knowledge base, where A says “B lies” and
B says “A says the truth”. Formally the knowledge base consists of Γ := {A =
¬True(“B”), B = True(“A”)}. By the definition of truth and by Subst, we can
derive from Γ the formula A = ¬A, a formula that is perfectly true when
we assume that A is undef. If we add A to the knowledge base, however, the
knowledge base is inconsistent. In that case, we can derive ¬A from the equivalence
as well, and from A and ¬A we can derive ⊥.
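The examples of this section can be cross-checked semantically by listing the three-valued models of a knowledge base. In the sketch below the axiom Truth has already been applied, so True(“X”) is replaced by X; formulae are hypothetical Python functions on interpretations, the atom M abbreviates murderer(Joe), and the name models is an assumption of this illustration.

```python
from itertools import product

VALUES = ("false", "undef", "true")
NEG = {"false": "true", "undef": "undef", "true": "false"}

def neg(a):
    return NEG[a]

def equiv(a, b):
    """Strong equivalence: true iff both truth values agree."""
    return "true" if a == b else "false"

def models(atoms, knowledge_base):
    """All interpretations that evaluate every formula of the knowledge base to true."""
    result = []
    for values in product(VALUES, repeat=len(atoms)):
        interpretation = dict(zip(atoms, values))
        if all(f(interpretation) == "true" for f in knowledge_base):
            result.append(interpretation)
    return result

liar_definition = lambda i: equiv(i["L"], neg(i["L"]))  # L = not True("L")
liar_asserted = lambda i: i["L"]                        # L itself

# The chain A, A = True("B"), B = True("C"), C = True("murderer(Joe)").
chain = [
    lambda i: i["A"],
    lambda i: equiv(i["A"], i["B"]),
    lambda i: equiv(i["B"], i["C"]),
    lambda i: equiv(i["C"], i["M"]),
]

print(models(["L"], [liar_definition]))
print(models(["L"], [liar_asserted, liar_definition]))
print(models(["A", "B", "C", "M"], chain))
```

A knowledge base without models is inconsistent, which is the case as soon as the liar sentence is asserted; the definition alone has exactly one model, and the chain of agents has exactly one model, in which murderer(Joe) is true.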

7 Summary 

We have seen in this work a three-valued approach that allows self-referential
sentences in a formal system almost without any restrictions. The only restriction
that we have to impose on such a system is to guarantee that the third truth
value is a fixpoint of any function that can be constructed from the connectives. At
first sight this seems to trivialise the system, since by this restriction we have
eliminated almost all tautologies from the system. However, when we assume a
background theory, with respect to which reasoning takes place, this is not a
serious restriction at all. All formulae in the background theory are assumed to
be true (this is equivalent to saying that they are not false and not paradoxical). If
we assume a paradoxical formula in the background theory as true, the system
becomes inconsistent (and not paradoxical), just as a classical system becomes
inconsistent by assuming a formula and its negation.

The natural deduction calculus can be used to derive true sentences from true
sentences, but also to make and to get rid of assumptions. One complication
arises in indirect proofs. If an assumption leads to a contradiction, we can’t
simply assume its negation. This holds only for defined sentences.

Summarising, we have presented a powerful system for stating facts about
truth. By adding further axiom schemata it is easy to extend the framework to
deal with knowledge and belief as well. No strange restrictions on the expressivity
are made in order to avoid self-referential statements. The system is simpler than
other systems that are based on a fixpoint construction, and standard calculi (with
slight restrictions) can be used. The work also shows a way of dealing with
paradoxes in higher-order logic without syntactically excluding them.
Abstract. Propositional greatest lower bounds (GLBs) are logically-defined
approximations of a knowledge base. They were defined in the
context of Knowledge Compilation, a technique developed for addressing the
high computational cost of logical inference. A GLB allows for polynomial-time
complete on-line reasoning, although soundness is not guaranteed.
In this paper we define the notion of k-GLB, which is basically the aggregate
of several lower bounds that retains the property of polynomial-time
on-line reasoning. We show that it compares favorably with a simple
GLB, because it can be a “more sound” complete approximation. We
also propose new algorithms for the generation of a GLB and a k-GLB.
Finally, we give a precise characterization of the computational complexity
of the problem of generating such lower bounds, thus addressing in a
formal way the question “how many queries are needed to amortize the
overhead of compilation?”



1 Introduction 

It is well known that problems in Logic, Automated Deduction and Artificial
Intelligence are very demanding from the computational point of view.
Two of the techniques that have been proposed for addressing such
computational hardness are Knowledge Compilation (KC) and Knowledge Approximation
(KA). The central idea of the former technique is to divide into two phases the
process of answering the question whether a query is logically entailed by a
knowledge base or not: in the first phase the knowledge base is preprocessed,
thus obtaining an appropriate data structure (such a phase is sometimes called
off-line reasoning); in the second phase, the query is actually answered using the
output of the first phase (such a phase is sometimes called on-line reasoning).
Typically, the output of off-line reasoning is a logical formula in an appropriate
target language. The goal of preprocessing is to make on-line reasoning
computationally easier (with respect to query answering with no preprocessing). This can be
enforced, for example, by choosing a target language which guarantees
polynomiality of reasoning.

In KA the central idea is to give up either soundness or completeness when
answering a logical query. Obviously, in this case too one aims to use fewer
computational resources than in the sound and complete case. The two tech- 
niques can be used together: In such a case we have a method for approximate 
KC. 

A method for approximate KC has been proposed by Selman and Kautz
in several papers. In [SK91] they define KC of propositional formulae, the target
language being the set of propositional Horn formulae (which admit well-known
polynomial-time algorithms for inference [DG84]). Compiling a propositional
formula $E$ delivers two distinct Horn formulae $E_{lb}$ and $E_{ub}$ such that
$E_{lb} \models E \models E_{ub}$. $E_{lb}$ is called a Horn lower bound (LB) of $E$,
while $E_{ub}$ is called its Horn upper bound (UB). As an example (cf. [SK91]), if $E$ is
$(\mathit{masterstudent} \lor \mathit{phdstudent}) \land (\mathit{masterstudent} \to \mathit{student}) \land (\mathit{phdstudent} \to \mathit{student})$,
then $(\mathit{masterstudent} \land \mathit{phdstudent} \land \mathit{student})$ is a Horn LB of $E$, and
$(\mathit{masterstudent} \to \mathit{student}) \land (\mathit{phdstudent} \to \mathit{student})$ is a Horn UB of $E$.
In general, $E_{lb}$ and $E_{ub}$ can be chosen in many different ways. A Horn LB $E_{lb}$
such that there is no other Horn LB $E'$ such that $E_{lb} \models E'$ and
$E' \not\models E_{lb}$ is called a “greatest” Horn lower bound (GLB). Referring to the
previous example, $(\mathit{masterstudent} \land \mathit{student})$ is a Horn GLB. This shows
that a Horn GLB $E_{glb}$ is a “complete and incorrect” approximation of $E$, since it
might be the case that $E \not\models E_{glb}$, or, in other words, the set of models of
$E_{glb}$ is a strict subset of the set of models of $E$. “Least” Horn upper bounds (LUB)
are defined dually, and are “correct and incomplete” approximations.
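The student example can be checked mechanically. The following is a minimal brute-force sketch, not part of the method of [SK91]: the function names (`source`, `lower_bound`, `greatest_lower_bound`, `upper_bound`, `entails`) are illustrative choices, and entailment is tested by enumerating all truth assignments, which is only feasible for tiny alphabets.

```python
from itertools import product

ATOMS = ("masterstudent", "phdstudent", "student")

def models(formula):
    """Set of truth assignments (tuples of booleans over ATOMS) satisfying formula."""
    return {vals for vals in product([False, True], repeat=len(ATOMS))
            if formula(dict(zip(ATOMS, vals)))}

def implies(a, b):
    return (not a) or b

def source(v):
    # (masterstudent or phdstudent) and both kinds of student are students
    return ((v["masterstudent"] or v["phdstudent"])
            and implies(v["masterstudent"], v["student"])
            and implies(v["phdstudent"], v["student"]))

def lower_bound(v):
    return v["masterstudent"] and v["phdstudent"] and v["student"]

def greatest_lower_bound(v):
    return v["masterstudent"] and v["student"]

def upper_bound(v):
    return (implies(v["masterstudent"], v["student"])
            and implies(v["phdstudent"], v["student"]))

def entails(f, g):
    """f |= g iff every model of f is a model of g."""
    return models(f) <= models(g)

# LB |= GLB |= source |= UB, with strict inclusions between the model sets.
print(entails(lower_bound, greatest_lower_bound),
      entails(greatest_lower_bound, source),
      entails(source, upper_bound))
```

The strict inclusion of the GLB's models in those of the source theory is exactly what makes reasoning with the GLB complete but possibly unsound.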

In [SK96], the target language is generalized to other polynomial classes of
propositional formulae, and two algorithms for generating, resp., a GLB and a 
LUB are provided. The reliability of the approximations (in terms of the per- 
centage of right answers obtained using the two bounds) is studied in [KS94]. 
Other papers dealing with propositional approximate KC are [dV95,Sch96]. 

In the present paper, we focus on lower bounds, and our goal is to advance
the state of knowledge on two specific aspects:

1. definition of a more general target language, 

2. study of computational properties of the compilation process. 

As for the former aspect, we define the notion of k-GLB, which is basically the 
aggregate of several LBs that retains the property of polynomial-time on-line 
reasoning. We show that it compares favorably with the GLB, because it can be 
a “more sound” complete approximation. 

As for the latter aspect, our main contributions are: 

— new algorithms for the generation of a GLB and a k-GLB,

— precise characterization of the computational complexity of the problem of 
generating such lower bounds. 

Although KC tries to shift the burden of query answering to off-line reasoning, 
in general one may ask “how many queries are needed to amortize the overhead
of compilation?” A characterization of the computational complexity of the
problem of generating a good LB gives a formal answer to such a question. So
far, the complexity of generating a GLB has been partially addressed only in the
context of Horn GLBs [SK96,Cad93]. 

2 Preliminaries 

In this section we first give the notions of lower bound and greatest lower bound 
of propositional theories, following [SK96]. Then we introduce a suitable family
of classes of propositional theories and study some general properties of such
classes. Finally, we define k-GLBs.

We assume theories are propositional formulae given in conjunctive normal 
form (CNF), so they can be represented by sets of clauses. We call “source 
theory” a theory whose LBs we are interested in. A clause is a disjunction of 
literals (propositional letters or their negation), and can be represented by the 
set of literals it contains. We use either the set notation or the logical notation, 
when no confusion arises. Given a theory $E$, $\mathcal{M}(E)$ is the set of its models,
defined in the standard way.

Definition 1 (LB and GLB of a theory [SK96]). Let $E$ be a CNF theory
and $\Theta$ be a class of CNF theories.

— A theory $E_{lb}$ belonging to $\Theta$ is a $\Theta$-LB (lower bound) of $E$ if
$\mathcal{M}(E_{lb}) \subseteq \mathcal{M}(E)$ (i.e., $E_{lb} \models E$).

— A theory $E_{glb}$ belonging to $\Theta$ is a $\Theta$-GLB (greatest lower bound) of $E$ if there
exists no $\Theta$-LB $E_{lb}$ of $E$ such that $\mathcal{M}(E_{glb}) \subset \mathcal{M}(E_{lb}) \subseteq \mathcal{M}(E)$.

If no confusion arises, we skip the prefix $\Theta$ when referring to LBs and GLBs.
A $\Theta$-clause is a clause which belongs to class $\Theta$.

We list some general properties of the classes $\Theta$ of CNF theories which are
the object of compilation:

complexity of inference problem: how much does it cost to decide whether
a clause $\gamma$ logically follows from a GLB of a theory $E$? We require that such
a problem is solvable in polynomial time in the size of $E$ plus the size of $\gamma$
(cf. previous section). This clearly implies that the size of a GLB is bounded by
a polynomial in the size of $E$.

complexity of recognition problem: how much does it cost to decide whether
a formula belongs to class $\Theta$? Typically, this is a polynomial-time problem.
Nevertheless, such a condition will be relaxed in the algorithms we show in
the next section.

syntactic restrictions:

unit clauses: if 1CNF theories (i.e., theories with only single-literal clauses)
belong to $\Theta$, we say $\Theta$ is a target class.

closure under conjunction and resolution: if such properties hold for a
target class $\Theta$, then we say $\Theta$ is an L-class (because it has some “locality”
properties, cf. Section 3).
Most classes of propositional theories used in Knowledge Representation, e.g.,
Horn, Dual Horn, and 2CNF, are L-classes. Although we always assume that $\Theta$
is a target class, in the next section we show that assuming that $\Theta$ is an L-class
leads to a simpler compilation problem. It is interesting to notice that some 
target classes with polynomial-time inference and recognition problems, closed 
under resolution, are not closed under conjunction (cf. e.g., “balanced” theories 
[CC95]). 

A list of general properties of the compiled theories which will be dealt with 
in the sequel of this paper follows. 

covering: is the logical “or” of all GLBs equivalent to the source theory (i.e., 
does the union of the models of the GLBs cover the models of the source 
theory)? Covering is a weak form of soundness, as it guarantees that there
are no inferences that are a priori lost. 

satisfiability: does a satisfiable GLB of a satisfiable source theory always exist? 
complexity of GLB checking: how much does it cost to know whether a for- 
mula is a GLB of a given source theory? 
complexity of the generation problem: how much does it cost to compute 
a GLB of a given source theory? 

number of GLBs: how many GLBs are there for a given source theory? 

We are able to obtain for target classes results similar to those shown in [SK96] 
for Horn formulae. $E_{lb}$ and $E_{glb}$ denote respectively an LB and a GLB of a
source theory E belonging to some fixed target class. 

Proposition 2.

1. The union of the models of all possible $E_{lb}$ equals $\mathcal{M}(E)$.

2. The union of the models of all possible $E_{glb}$ equals $\mathcal{M}(E)$.

3. $E$ is satisfiable if and only if there is at least one satisfiable $E_{glb}$.

4. Checking if a theory $E^*$ belonging to some target class is a GLB of $E$ is
coNP-complete.

5. Finding a GLB of $E$ belonging to a target class with a polynomial inference
problem is NP-hard.

Proof.

1. Let $M \in \mathcal{M}(E)$. The conjunction of all the literals assigned true by $M$ is an
$E_{lb}$.

2. Straightforward.

3. ($\Rightarrow$) From point 1. ($\Leftarrow$) Let $E'$ be a GLB of $E$. Then
$\mathcal{M}(E') \subseteq \mathcal{M}(E)$ and, since $E'$ is satisfiable, $\mathcal{M}(E') \neq \emptyset$.

4. (coNP-hardness) Let $l$ be a literal and $E^* = l \land \neg l$. $E^*$ belongs to the target
class and is not satisfiable. From point 3 it easily follows that $E^*$ is a GLB if
and only if $E$ is not satisfiable. (Membership in coNP) Since the size of a
GLB of $E$ is always bounded by a polynomial in $|E|$, we can use a simple
guess-and-check algorithm.

5. Deciding whether $E$ is satisfiable or not reduces to the problem of finding a
GLB $E_{glb}$ and deciding whether $E_{glb} \models l \land \neg l$. The latter problem is polynomial.
□
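Point 1 of the proof can be illustrated on a small instance. The sketch below is only an illustration under stated assumptions: literals are encoded as non-zero integers (a negative integer is a negated letter), the source theory `E` is a hypothetical two-clause example, and the helper names are not taken from the paper. Each model is turned into a 1CNF theory, which is a lower bound, and the union of the models of these lower bounds is compared with the models of `E`.

```python
from itertools import product

def satisfies(bits, clauses):
    """True iff the assignment bits (tuple of booleans) satisfies every clause."""
    return all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses)

def models(clauses, n):
    """All assignments over letters 1..n that satisfy the clause set."""
    return {bits for bits in product([False, True], repeat=n)
            if satisfies(bits, clauses)}

def unit_theory(bits):
    """1CNF theory (one unit clause per letter) whose only model is bits."""
    return [frozenset([i + 1 if value else -(i + 1)])
            for i, value in enumerate(bits)]

# Hypothetical source theory: (x1 or x2) and (not x1 or x3)
E = [frozenset({1, 2}), frozenset({-1, 3})]
N = 3

covered = set()
for m in models(E, N):
    lb = unit_theory(m)
    # Every such conjunction of literals is a lower bound of E ...
    assert models(lb, N) <= models(E, N)
    covered |= models(lb, N)

# ... and together these lower bounds cover all models of E.
print(covered == models(E, N))
```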
As for point 5, NP-hardness is just a lower bound on the complexity. In the
rest of the paper, upper bounds in different cases will be provided. As for the 
number of Horn GLBs, in [SK96] it is proven that in general there are exponen- 
tially many (wrt the size of E), but we do not know whether this holds for any 
target class. Anyway, if they were just polynomially many, since their disjunction 
“covers” E (Proposition 2.2), and inference from a GLB is a polynomial-time 
task, we could “compile” the source theory and answer any query in polyno- 
mial time, which, using the techniques shown in [KS92], would imply that the 
polynomial hierarchy collapses at the second level. 

As shown in [SK96], the key idea in generating a GLB is to choose an appropriate
strengthening of each clause $C$ of $E$, i.e., to choose which literals to delete
from $C$. The formal definition follows.

Definition 3 ($\Theta$-strengthening [SK96]). Let $\Theta$ be a target class. A $\Theta$-clause
$C_\Theta$ is a $\Theta$-strengthening of a clause $C$ if $C_\Theta \subseteq C$ and there is no $\Theta$-clause $C'_\Theta$ such
that $C_\Theta \subset C'_\Theta \subseteq C$.
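For two of the classes discussed in this paper, Horn and 2CNF, the strengthenings of a clause are easy to enumerate. The following sketch assumes clauses are given as collections of non-zero integers (negative meaning negated); the function names are illustrative only. A Horn strengthening keeps all negative literals and at most one positive literal; a 2CNF strengthening keeps a maximal subset of at most two literals.

```python
from itertools import combinations

def horn_strengthenings(clause):
    """Maximal Horn subclauses: all negative literals plus one positive literal."""
    positives = sorted(lit for lit in clause if lit > 0)
    negatives = [lit for lit in clause if lit < 0]
    if len(positives) <= 1:
        return [frozenset(clause)]          # already Horn
    return [frozenset(negatives + [p]) for p in positives]

def two_cnf_strengthenings(clause):
    """Maximal subclauses with at most two literals."""
    if len(clause) <= 2:
        return [frozenset(clause)]          # already 2CNF
    return [frozenset(pair) for pair in combinations(sorted(clause), 2)]

# x1 or x2 or not x3 has two Horn strengthenings: {x1, not x3} and {x2, not x3}
print(horn_strengthenings([1, 2, -3]))
print(two_cnf_strengthenings([1, 2, 3]))
```

In both cases the number of strengthenings of a clause is polynomial in its length, which is the kind of condition the complexity results of Section 3 rely on.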

The first contribution of the present paper is to generalize Definition 1 to a 
k-tuple of LBs.

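Definition 4 (k-LB and k-GLB of a theory). Let $E$ be a CNF theory, $\Theta$ be
a class of CNF theories, and $k$ be a fixed number.

— A k-tuple of theories $E^1_{lb}, \ldots, E^k_{lb}$ belonging to $\Theta$ is a $\Theta$-k-LB of $E$ if
$\mathcal{M}(E^i_{lb}) \subseteq \mathcal{M}(E)$ for each $i$ ($1 \le i \le k$).

— A k-tuple of theories $E^1_{lb}, \ldots, E^k_{lb}$ belonging to $\Theta$ is a $\Theta$-k-GLB of $E$ if there
exists no $\Theta$-k-LB $E'^1, \ldots, E'^k$ of $E$ such that
$\mathcal{M}(E^1_{lb}) \cup \cdots \cup \mathcal{M}(E^k_{lb}) \subset \mathcal{M}(E'^1) \cup \cdots \cup \mathcal{M}(E'^k) \subseteq \mathcal{M}(E)$.

Referring to the example in Section 1, $E^1 = (\mathit{phdstudent} \land \mathit{student})$,
$E^2 = (\mathit{masterstudent} \land \mathit{student})$ is a Horn-2-GLB of $E$.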

A k-LB or k-GLB $E^1, \ldots, E^k$ can be used for on-line reasoning, since for
each formula $C$, $E^1 \lor \cdots \lor E^k \models C$ iff $(E^1 \models C) \land \cdots \land (E^k \models C)$.
If $k$ is bounded by a polynomial in the size of $E$, then on-line reasoning using a k-LB
or k-GLB is still a polynomial-time task.
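The equivalence above means that a query can be answered by testing each component separately. The sketch below illustrates this on the student example, assuming the encoding 1 = masterstudent, 2 = phdstudent, 3 = student and using brute-force enumeration in place of the polynomial-time Horn inference that would be used in practice; all function names are illustrative.

```python
from itertools import product

def satisfies(bits, clauses):
    return all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses)

def models(clauses, n):
    return {bits for bits in product([False, True], repeat=n)
            if satisfies(bits, clauses)}

def entails(clauses, query_clause, n):
    """clauses |= query_clause, checked by enumerating the models of clauses."""
    return all(satisfies(bits, [query_clause]) for bits in models(clauses, n))

def k_lb_entails(bounds, query_clause, n):
    """The disjunction of the bounds entails the query iff every bound does."""
    return all(entails(bound, query_clause, n) for bound in bounds)

E = [{1, 2}, {-1, 3}, {-2, 3}]   # source theory of Section 1
E1 = [{2}, {3}]                  # phdstudent and student
E2 = [{1}, {3}]                  # masterstudent and student

print(k_lb_entails([E1, E2], [3], 3))   # is "student" entailed?
print(k_lb_entails([E1, E2], [1], 3))   # is "masterstudent" entailed?
```

On this particular instance the two components together have exactly the models of the source theory, so the 2-GLB answers every query as the source theory would.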

It is interesting to note that in a k-GLB $E^1_{lb}, \ldots, E^k_{lb}$, a theory $E^i_{lb}$ does not
have to be a GLB (cf. forthcoming Example 16). Moreover, although for each
k-GLB $E_{kglb}$ of $E$ there always exist $k$ distinct GLBs that capture the same set
of models as $E_{kglb}$, such $k$ GLBs cannot, in general, be chosen arbitrarily (cf.
forthcoming Example 17).

Moreover, observe that in principle, in Definition 4, $E^1_{lb}, \ldots, E^k_{lb}$ may belong
to distinct classes $\Theta$.

In the rest of the paper we refer to the following complexity classes for decision
problems: P, NP, coNP, $\Delta_2^p = \mathrm{P}^{\mathrm{NP}}$, $\Sigma_2^p = \mathrm{NP}^{\mathrm{NP}}$. We also refer to the
corresponding classes for search problems, by adding the prefix “F” (e.g., FNP,
F$\Sigma_2^p$) [Joh90]. Intuitively, an algorithm in FNP is a polynomial-time algorithm
which uses a polynomial number of calls to an NP oracle (i.e., a procedure which




360 Marco Cadoli, Luigi Palopoli, and Francesco Scarcello 



is able to solve an NP problem at unit cost). In an F$\Sigma_2^p$ algorithm, the oracle
is able to solve $\Sigma_2^p$ problems, i.e., it has the power of an NP machine
with an NP oracle. In practice, an FNP algorithm can be implemented with a
polynomial-time algorithm and a subroutine able to solve an NP-complete
problem.



3 Algorithms and complexity of generating a GLB 

In this section we illustrate the following results: 

— we show that the computation of a $\Theta$-GLB of a CNF theory can be done in
F$\Sigma_2^p$ under very weak conditions on $\Theta$;

— we prove some properties of $\Theta$-GLBs by which, under some additional
conditions on $\Theta$, the computation of a $\Theta$-GLB of a CNF theory can be done in
FNP.

We begin with the first item above. 

Proposition 5. Let $\Theta$ be a target class with a $\Delta_2^p$ recognition problem. Then
finding a $\Theta$-GLB of $E$ is in F$\Sigma_2^p$.

Proof. (Sketch) As mentioned in Section 2, for each $\Theta$ there is a polynomial $p$
such that the size of a $\Theta$-GLB of $E$ is less than or equal to $p(n)$, where $n$ is
the size of $E$'s alphabet. If we assume no tautology and no redundancies occur
in the GLB, we can compute it using at most $2np(n)$ calls to an oracle in $\Sigma_2^p$.
In fact, since we cannot have more than $p(n)$ clauses in the GLB and each letter
in $E$'s alphabet can appear at most once in each clause, at the generic oracle
call we shall have a set of clauses consisting of (1) a set of clauses $E'$ we have
already decided to be part of a $\Theta$-GLB we are constructing and (2) a (partially
formed) clause $C''$. Then, for each literal $l$, we ask the oracle whether there exists
a $\Theta$-GLB $\Gamma$ of $E$ such that $E' \cup \{C\} \subseteq \Gamma$, where $C'' \cup \{l\} \subseteq C$. □

In some cases, we can improve the upper bound given by Proposition 5,
exhibiting an FNP algorithm.

Proposition 6. Finding a 1CNF-GLB of $E$ is in FNP.

Proof. (Sketch.) The proof immediately follows from noting that any 1CNF theory
is a 1CNF-GLB of $E$ iff it is a prime implicant of $E$. □
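One way to read Proposition 6 operationally is to start from a model and greedily drop literals while the remaining conjunction is still an implicant. The sketch below is an illustration under assumptions, not the authors' procedure: the implicant test is done by brute-force enumeration, standing in for a call to an NP oracle, and the theory `E` and all names are hypothetical.

```python
from itertools import product

def satisfies(bits, clauses):
    return all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses)

def is_implicant(term, clauses, n):
    """term (a set of literals) is an implicant iff every assignment
    that agrees with term satisfies all clauses (oracle stand-in)."""
    for bits in product([False, True], repeat=n):
        agrees = all(bits[abs(lit) - 1] == (lit > 0) for lit in term)
        if agrees and not satisfies(bits, clauses):
            return False
    return True

def prime_implicant_from_model(bits, clauses, n):
    """Greedily remove literals; one pass suffices because supersets of
    implicants are implicants."""
    term = {i + 1 if value else -(i + 1) for i, value in enumerate(bits)}
    for lit in sorted(term, key=abs):
        smaller = term - {lit}
        if is_implicant(smaller, clauses, n):
            term = smaller
    return term

# Hypothetical source theory: (x1 or x2) and (not x1 or x3)
E = [frozenset({1, 2}), frozenset({-1, 3})]
print(prime_implicant_from_model((True, True, True), E, 3))
```

The loop makes one implicant test per letter, i.e., a linear number of oracle calls, which matches the intuition behind the FNP bound.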

On the other hand, 1CNF-GLBs are not able to cover the models of the
source theory very well. In fact, if $I$ is a 1CNF-GLB of $E$, then for each target
class $\Theta$, $I$ is a $\Theta$-LB of $E$.

Coming to the second goal of this section, we show sufficient conditions on
$\Theta$ that guarantee, for the problem of finding a $\Theta$-GLB, the same upper bound on
the complexity as in the 1CNF case.

The basic assumption we make is that $\Theta$ is an L-class. In such a case, it is possible
to compute a GLB “a clause at a time”, thereby obtaining an FNP method.
In fact, for L-classes we can generalize Lemma 2 of [SK96] in the following way.

Lemma 7. Let $\Theta$ be an L-class and $E_{glb}$ be a $\Theta$-GLB of a CNF theory
$E = \{C_1, \ldots, C_n\}$. Then there exists a set of clauses $\{C'_1, \ldots, C'_n\}$, where each $C'_i$ is
a $\Theta$-strengthening of $C_i$, such that $E_{glb} \equiv \{C'_1, \ldots, C'_n\}$.

Lemma 7 allows us to find a $\Theta$-GLB by choosing its clauses among the
$\Theta$-strengthenings of the clauses of $E$. If a $\Theta$-GLB $E_{glb}$ of $E$ is composed only of
$\Theta$-strengthenings of clauses in $E$ (as in the lemma above), we say that $E_{glb}$ is in
normal form (NF).
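Lemma 7 already suggests a naive, exponential-time way to obtain all GLBs up to equivalence: enumerate every choice of one strengthening per clause and keep the choices whose model sets are maximal. The sketch below does this for the Horn class on the student example of Section 1 (encoding 1 = masterstudent, 2 = phdstudent, 3 = student); it is an illustrative brute-force baseline with hypothetical function names, not the FNP procedure of Figure 1.

```python
from itertools import product

def satisfies(bits, clauses):
    return all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses)

def models(clauses, n):
    return {bits for bits in product([False, True], repeat=n)
            if satisfies(bits, clauses)}

def horn_strengthenings(clause):
    """All negative literals plus at most one positive literal."""
    positives = sorted(lit for lit in clause if lit > 0)
    negatives = [lit for lit in clause if lit < 0]
    if len(positives) <= 1:
        return [frozenset(clause)]
    return [frozenset(negatives + [p]) for p in positives]

def normal_form_glbs(theory, n, strengthenings):
    """Normal-form theories whose model set is not strictly contained
    in the model set of another normal-form theory."""
    candidates = [(list(choice), models(choice, n))
                  for choice in product(*(strengthenings(c) for c in theory))]
    return [cand for cand, ms in candidates
            if not any(ms < other for _, other in candidates)]

E = [frozenset({1, 2}), frozenset({-1, 3}), frozenset({-2, 3})]
for glb in normal_form_glbs(E, 3, horn_strengthenings):
    print(sorted(sorted(c) for c in glb))
```

On this input the two theories found are equivalent to masterstudent ∧ student and phdstudent ∧ student, in agreement with the Horn GLB given in Section 1.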

From this important property of L-classes, we derive the following technical 
results. For any theory E, let IM{E) denote the set of its implicants. 

Lemma 8. Let $E$ be a CNF theory, $C_i$ one of its clauses, and $\Theta$ an L-class. If
there exists an implicant $p \in IM(E)$ and a literal $l \in C_i$ such that $p \cap C_i = \{l\}$,
then, for each $\Theta$-GLB in normal form $E'$ for $E$ s.t. $p \models E'$, the $\Theta$-strengthening
$C'_i \in E'$, corresponding to $C_i$, must include the literal $l$.

Proof. Straightforward. □ 

Our FNP procedure for computing a $\Theta$-GLB of $E$ for any L-class $\Theta$ is shown in
Figure 1.

Lemma 9. Let $E$ be a CNF theory, $\Theta$ an L-class, $I$ a set of implicants for $E$,
$C$ a clause of $E$, and $l \in C$ a literal such that for each implicant $p \in IM(E)$
containing $l$, either $p \cap C \neq \{l\}$, or $GlobalComp(I \cup \{p\}, E)$ is false. Then, any
$\Theta$-GLB $E'$ for $E$ such that $I \subseteq IM(E')$ is also a $\Theta$-GLB for $E \land (C \setminus \{l\})$.

Proof. (Sketch) Assume, w.l.o.g., that $E'$ is in NF and observe that
$E' \models E \land (C \setminus \{l\})$. This is a consequence of the hypothesis on $l$, the fact that
$I \subseteq IM(E')$, and the assumption that $E'$ is composed of $\Theta$-strengthenings of $E$.

Since $E \land (C \setminus \{l\}) \models E$, it is easy to verify that $E'$ is in fact a $\Theta$-GLB of
$E \land (C \setminus \{l\})$. □

The following theorem states the correctness of ComputeGLB. 

Theorem 10. Let $E$ be a CNF theory and $\Theta$ be an L-class of target theories.
Procedure ComputeGLB computes (in the input/output parameter $E$) a $\Theta$-GLB
of the source theory $E$.

Proof. (Sketch) Denote by $E^g$ the theory obtained at the end of ComputeGLB.
First, note that $E^g$ belongs to $\Theta$. At each step of the outer cycle of the algorithm,
we select a clause $C \in E$ and compute a clause $\hat{C} \in \Theta$. Indeed, $\Theta$ is an L-class, hence
unit clauses belong to $\Theta$, and $\Theta$ is closed under resolution. As a consequence,
if $C'$ belongs to $\Theta$, any non-empty clause $C'' \subseteq C'$ belongs to $\Theta$ as well. By
construction, $\hat{C}$ is a subset of some “candidate” clause $C' \in \Theta$, thus $\hat{C} \in \Theta$.
Furthermore, $\Theta$ is closed under conjunction, hence $E^g$ belongs to $\Theta$, and it is
clearly a $\Theta$-LB of $E$. Moreover, by construction, no implicant $p \in I$ can violate
any clause of $E^g$, hence, $I \subseteq IM(E^g)$.

Let the input theory $E$ be the conjunction (or set) of clauses $C_1 \land \ldots \land C_n$,
and assume, w.l.o.g., that clauses already belonging to $\Theta$ are those with higher
indexes. Denote by $E^s = C^s_1 \land \ldots \land C^s_n$ the value of the parameter $E$ at the
beginning of the $s$-th execution of the outer for cycle, i.e., the step in which the
clause $C_s$ is selected.

We proceed by contradiction. Assume $E^g$ is not a $\Theta$-GLB of the source theory
$E$. Then, there exists a $\Theta$-GLB $E'$ of $E$ such that $E^g \models E' \models E$, and $E' \not\models E^g$.
By Lemma 7, we can assume, w.l.o.g., that $E' = C'_1 \land \cdots \land C'_n$, where each clause
$C'_k$, for $1 \le k \le n$, is a $\Theta$-strengthening of the corresponding clause $C_k$ belonging
to $E$.

Since $E' \not\models E^g$, there exists a clause belonging to $E^g$ violated by some
implicant of $E'$. Let $i$ be the least index in $\{1, \ldots, n\}$ s.t. the clause $C^g_i$ is
violated by some implicant of $E'$. Such a clause $C^g_i$ is computed at step $i$ of the
execution of the outer for cycle of procedure ComputeGLB. Now, consider the
theory $E^i$ computed before such a step is executed. By construction, for each
$k < i$, we have $C^i_k = C^g_k$, and for each $j \ge i$, the clause $C^i_j$ of $E^i$ is identical to
the corresponding clause $C_j$ of the original input theory $E$. By the choice of $i$,
no clause $C^g_k$, for $k < i$, can be violated by any implicant $p' \in IM(E')$, hence,
$E' \models E^i$ clearly holds, and $IM(E') \subseteq IM(E^i)$. In fact, it is easy to see that $E'$
must be a $\Theta$-GLB of $E^i$.

Now, let $C_i$ and $\hat{C}_i$ be the value of variable $C$ and $\hat{C}$, respectively, at the end
of the inner for, where the clause $C^i_i$ of $E^i$ is selected. Thus, we have $\hat{C}_i = C^g_i$.
Since $I \subseteq IM(E')$, by Lemma 9, Lemma 8, and Lemma 7, it follows that there
exists a $\Theta$-GLB $E''$ of $E^i \land C_i$ which is equivalent to $E'$, and which is in normal
form. Let $C''_i$ be the $\Theta$-strengthening of $C_i$ belonging to $E''$, hence, we have
$C''_i \subseteq C_i$.

If there exists a literal $l \in \hat{C}_i$ s.t. $l \notin C''_i$, then there exists an implicant $p' \in I$
s.t. $p' \cap C_i = \{l\}$. However, such a $p'$ would violate $C''_i$, which is a subset of $C_i$.
This is clearly a contradiction, since $I \subseteq IM(E'')$.

Otherwise, assume $\hat{C}_i \subset C''_i \subseteq C_i$. Since each literal selected in the inner for
is deleted from $C_i$ or belongs to $\hat{C}_i$, for each literal $l' \in (C''_i \setminus \hat{C}_i)$, such a literal
$l'$ does not belong to any $C' \in Candidates_\Theta(C_i, I)$ s.t. $(\hat{C}_i \cup \{l'\}) \subseteq C'$. This
contradicts the existence of $C''_i$, which should belong to $Candidates_\Theta(C_i, I)$, and
thus the existence of the GLB $E''$, as well. □

The following result states the complexity of procedure Compute-GLB. 

Theorem 11. Let $E$ be a CNF theory and $\Theta$ be an L-class of target theories. If
the set of $\Theta$-strengthenings is polynomial-time computable, then finding a $\Theta$-GLB
of $E$ is in FNP.

Proof. (Sketch) In each step of the outer cycle of the procedure ComputeGLB,
a clause $C_i \in E$ is selected and a $\Theta$-clause $\hat{C}_i \subseteq C_i$ for it is computed. Such a clause
will not be selected any more. Moreover, any literal belonging to $C_i$ is selected
at most once, thus the inner cycle in the procedure is executed $|C_i|$ times, where
$|C_i|$ denotes the number of literals in $C_i$. Hence, if the number of clauses in $E$
is $n$, the procedure ComputeGLB terminates after at most $\sum_{i=1}^{n} |C_i|$ executions
of the if statement appearing in the inner cycle.

Now, observe that the cardinality of the set $I$, which is the name of a parameter
of both $Candidates_\Theta$ and $GlobalComp$, is always $\le \sum_{i=1}^{n} |C_i|$ and,
INPUT: A CNF theory $E$ without tautological clauses and an L-class $\Theta$.
OUTPUT: A $\Theta$-GLB of $E$.

Procedure ComputeGLB (var $E$: Theory)

  Function $Candidates_\Theta$($C$: Clause, $I$: Set of Implicants): SetOfClauses
  begin
    return each $\Theta$-strengthening $C'$ of $C$ s.t., $\forall p \in I$, $C' \cap p \neq \emptyset$;
  end;

  Function $GlobalComp$($I$: Set of Implicants, $E$: Theory): Boolean;
  begin
    if $\forall C \in E$ $Candidates_\Theta(C, I) \neq \emptyset$ then return true
    else return false
  end;

begin
  $I := \emptyset$;
  for each non-$\Theta$ clause $C \in E$ do
  begin
    $\hat{C} := \emptyset$;
    for each $l \in (C \setminus \hat{C})$ s.t. $\exists C' \in Candidates_\Theta(C, I)$ s.t. $\hat{C} \cup \{l\} \subseteq C'$ do
      if $\forall C' \in Candidates_\Theta(C, I)$ s.t. $\hat{C} \subseteq C'$ we have $l \in C'$
      then $\hat{C} := \hat{C} \cup \{l\}$
      elsif $\exists p \in IM(E)$ s.t. $p \cap C = \{l\}$ and $GlobalComp(I \cup \{p\}, E)$
      then begin
        $\hat{C} := \hat{C} \cup \{l\}$;
        $I := I \cup \{p\}$
      end
      else $C := C \setminus \{l\}$;
    $C := \hat{C}$;
  end;
  output $E$
end.

Fig. 1. The FNP algorithm for computing a $\Theta$-GLB.
hence, polynomial in the size of the input theory $E$. Moreover, by hypothesis,
$Candidates_\Theta$ is polynomial-time computable, hence all the executions of either
$Candidates_\Theta$ or $GlobalComp$ are polynomial-time bounded. This entails that
every execution of the elsif statement in the procedure can be performed by a
single call to an NP oracle, and thus the whole procedure is in FNP. □

As a consequence, finding a GLB of a source theory when the target class is Horn,
Dual Horn, or 2CNF can be done in FNP. Note that, for all such classes, the set of
$\Theta$-strengthenings of any clause is very easy to compute.
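The two auxiliary functions of Figure 1 are simple to state in code once strengthenings are available. The sketch below instantiates them for the 2CNF class; clauses and implicants are frozensets of non-zero integers (negative meaning negated), the input data are hypothetical, and the names `candidates` and `global_comp` merely mirror the functions of the figure. The search for an implicant in the main procedure, which is where the NP oracle is needed, is deliberately left out.

```python
from itertools import combinations

def two_cnf_strengthenings(clause):
    """Maximal subclauses with at most two literals."""
    if len(clause) <= 2:
        return [frozenset(clause)]
    return [frozenset(pair) for pair in combinations(sorted(clause), 2)]

def candidates(clause, implicants, strengthenings=two_cnf_strengthenings):
    """Strengthenings of clause that share a literal with every implicant."""
    return [s for s in strengthenings(clause)
            if all(s & p for p in implicants)]

def global_comp(implicants, theory, strengthenings=two_cnf_strengthenings):
    """True iff every clause of theory still has a compatible candidate."""
    return all(candidates(c, implicants, strengthenings) for c in theory)

# Hypothetical data: three clauses over letters 1..4 and one set of literals
# playing the role of an implicant.
theory = [frozenset({1, 2, 3}), frozenset({2, -1}), frozenset({1, 4, 3})]
I = [frozenset({2, 4})]

print(candidates(theory[0], I))
print(global_comp(I, theory))
```

Both functions run in time polynomial in the size of the theory and of the implicant set, as required by the hypothesis of Theorem 11.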

The following example shows an application of the procedure ComputeGLB
for the computation of a 2CNF-GLB.

Example 12. Let $E = C_1 \land C_2 \land C_3$ be the following CNF theory:

$(b \lor a \lor c) \land (b \lor \neg a) \land (a \lor d \lor c)$

We apply procedure ComputeGLB to find a 2CNF-GLB of $E$. First, we consider
the clause $C_1 = b \lor a \lor c$ and let $\hat{C} = \emptyset$.

We begin by selecting the literal $b \in C_1$. We find that there exists an
implicant of $E$, $p_b = \{b, d\}$, such that $p_b \cap C_1 = \{b\}$. It is easy to verify that
$GlobalComp(\{p_b\}, E)$ returns true, so we set $\hat{C} := \{b\}$ and $I := \{p_b\}$. Next,
we select literal $a \in C_1$. For this literal, there exists no implicant $p \in IM(E)$
such that $p \cap C_1 = \{a\}$. Then, we can drop literal $a$ and we get $C_1 := b \lor c$.
Next, we select literal $c$. Since $c$ belongs to every clause in the (singleton) set
$Candidates_\Theta(C_1, I)$, it is immediately added to $\hat{C}$.

At this point, we have computed a $\Theta$-clause $\hat{C} = b \lor c$, which thus replaces $C_1$
in $E$. Then, we have the following theory $E'$:

$(b \lor c) \land (b \lor \neg a) \land (a \lor d \lor c)$

Moreover, the compatible candidates for $I$ are the following: $Candidates_\Theta(C_1, I)
= \{b \lor c\}$, $Candidates_\Theta(C_2, I) = \{b \lor \neg a\}$, and $Candidates_\Theta(C_3, I) = \{d \lor c, a \lor d\}$.
It is easy to see that the procedure ComputeGLB leaves the second clause untouched.
For the clause $C_3$, we begin by selecting literal $a$. For $a$, there exists an
implicant $p_a = \{a, b\}$ such that $p_a \cap C_3 = \{a\}$. Now, $GlobalComp(\{p_b, p_a\}, E')$
returns true; therefore, we set $\hat{C} := \{a\}$ and $I := \{p_b, p_a\}$. Now, $Candidates_\Theta(C_3, I)
= \{a \lor d\}$. Therefore, $d$ is immediately added to $\hat{C}$.

Thus, we finally get the following 2CNF-GLB of $E$:

$(b \lor c) \land (b \lor \neg a) \land (a \lor d)$

It is worth noting that, by our results, for any L-class $\Theta$, the complexity of
finding a $\Theta$-GLB remains the same as for the simplest case of 1CNF, provided
that the set of $\Theta$-strengthenings of any clause is polynomial-time computable.

Compared with the procedure proposed in [SK96], our method represents
a clear improvement as far as theoretical complexity is concerned,
since the former, based on an enumeration technique, works in exponential time.
Of course, if P $\neq$ NP, an effective implementation of our method cannot work in
polynomial time either, but it shows that the problem is not intrinsically exponential
and that many optimizations can be made by exploiting particular features of the
selected target class.

Compared with our first algorithm working in F$\Sigma_2^p$, our method is more
efficient but less general, because it requires $\Theta$ to be closed under conjunction and
resolution.



4 Properties of k-GLBs

In this section we investigate the use of k-GLBs as target theories for knowledge
compilation. We begin by stating the complexity of checking $\Theta$-k-GLBs. Then,
we show that, from a complexity-theoretic point of view, finding a $\Theta$-k-GLB for
a general target class $\Theta$ is doable in F$\Sigma_2^p$ and, as such, is not more difficult than
finding a $\Theta$-GLB. Finally, we concentrate on semantical considerations and show,
also using some examples, that employing $\Theta$-k-GLBs often compares favorably
with using $\Theta$-GLBs.

Proposition 13. Let $E$ be a CNF theory and $E_1, \ldots, E_k$ be a k-tuple of theories
belonging to some target class $\Theta$. Checking if $E_1, \ldots, E_k$ is a $\Theta$-k-GLB of $E$
is coNP-complete.

Proof. (coNP-hardness) Let $a_1, \ldots, a_k$ be $k$ letters from $E$. Then $E$ is not
satisfiable iff $\{a_i \land \neg a_i \mid 1 \le i \le k\}$ is a $\Theta$-k-GLB of $E$. (Membership in coNP) Since
the size of the GLBs of $E$ is always bounded by a polynomial in $|E|$ and $k$ is
fixed, we can use a simple guess-and-check algorithm. □

Such a proposition corresponds to Proposition 2.4 in the case of k-GLBs. It
allows us to provide a simple F$\Sigma_2^p$ algorithm for generating a $\Theta$-k-GLB.

Proposition 14. Let $E$ be a CNF theory, $k$ be a fixed number and $\Theta$ be a target
class with a $\Delta_2^p$ recognition problem. Then finding a $\Theta$-k-GLB of $E$ is in F$\Sigma_2^p$.

Proof. The proof is analogous to that of Proposition 5. The main difference is that,
in this case, we need at most $2nkp(n)$ oracle calls. □

Thus, from a complexity-theoretic viewpoint, for general $\Theta$ theories,
constructing a $\Theta$-k-GLB is not more difficult than computing a $\Theta$-GLB. On the
other hand, employing k-GLBs in place of GLBs for knowledge compilation
purposes can be advantageous from the semantical viewpoint. Indeed, the following
proposition shows that, in general, k-GLBs capture more models
of the source theory than GLBs.

Proposition 15. Let $E$ be a CNF theory and $E_{glb_1}, \ldots, E_{glb_k}$ be $k$ distinct
$\Theta$-GLBs of $E$. Then there exists a k-GLB $E_{kglb}$ of $E$ such that
$\mathcal{M}(E_{glb_1}) \cup \ldots \cup \mathcal{M}(E_{glb_k}) \subseteq \mathcal{M}(E_{kglb})$.

Proof. The proof is immediate by noting that either $E_{glb_1}, \ldots, E_{glb_k}$ is itself a
k-GLB of $E$ or otherwise there exists a k-LB $E_{klb}$ of $E$ such that
$E_{glb_1} \lor \cdots \lor E_{glb_k} \models E_{klb}$. □
Note that, in fact, there are cases where a k-GLB contains several LBs which
are not GLBs, as shown in the following example.

Example 1 6. Let E be the theory {cV d) A {aV b) A {aV c) and assume the target 
class is that of implicants. Then, the theory (a A c) V (a A & A c) V (a A 6 A c) is a 
1GNF-3-GLB for E and neither (a A b Ac) nor (d A bA c) are GLBs for E. 

It is obvious, however, that for each $\Theta$-k-GLB $E_{kglb}$ of $E$ there always exist $k$
distinct GLBs that capture the same set of models as $E_{kglb}$. Nevertheless, these
$k$ GLBs cannot, in general, be chosen arbitrarily. This fact is shown by the
following example.

Example 1 7. Let E be the theory (a V & V d) A (& V c V d) A (a V c). Assume our 
target class is that of implicants. On the one hand, consider the 1GNF-4-GLB 
A'gjf, = (aAcAd) V (&AcAd)V (oAcAd) V (oA&Ad). On the other hand, consider 
the following four GLBs: Egu,^ = (aAcAd), Egu,^ = (aAbAc), Egu,^ = (bAcAd) 
and Egib^ — {d A b A d) . It is the easy to see that M{E) = M(Egif^) and that 
M{Egii,^)UA4{Egib2)tJM{Egib^)UM{Egib^) does not include the following three 
models of E: Mi = {d,b,c,d}, M 2 = {d, &, c, d} and M 3 = {d, &, c, d}. Moreover 
note that we need at least two further GLBs to be added to the four cited above 
to capture Mi , M 2 and M 3 . 

5 Conclusions 

In this paper we have defined k-GLBs, generalizations of propositional GLBs,
and we have investigated the computational complexity of the problems of finding
a GLB and a k-GLB. For both of them, we have shown a general upper bound
of F$\Sigma_2^p$. As for GLBs, we have exhibited an algorithm that works in FNP. Such
an algorithm is proven to be correct for L-classes, i.e., classes of propositional
formulae which are closed under conjunction and resolution. Restriction to
L-classes is important, since there are classes (e.g., “balanced” theories, cf. [CC95])
which are not closed under conjunction. The question whether the problem of
generating a k-GLB is intrinsically more complex than the one of generating a
GLB is still open.
Abstract. Generalization is a fundamental operation of inductive inference.
While first order syntactic generalization (anti-unification) is
well understood, its various extensions are needed in applications. This
paper discusses syntactic higher order generalization in a higher order
language $\lambda 2$ [1]. Based on the application ordering, we prove that the least
general generalization exists and is unique up to renaming. An algorithm
to compute the least general generalization is presented.



Keywords: Higher order logic, unification, anti-unification, generalization. 

1 Introduction 

The meaning of the word generalization is so general that we can find its occur- 
rences in almost every area of study. In computer science, especially in the area of 
artificial intelligence, generalization serves as a foundation of inductive inference, 
and finds its applications in diverse areas such as inductive logic programming 
[10], theorem proving [12], and program derivation [4] [5]. In the strict sense,
generalization is a dual problem of first order unification and is often called (ordinary)
anti-unification¹. More specifically, it can be formulated as: given two terms $t$
and $s$, find a term $r$ and substitutions $\theta_1$ and $\theta_2$ such that $r\theta_1 = t$ and $r\theta_2 = s$.
Ordinary anti-unification was well understood as early as 1970 [13] [15]. Due to
the fact that it is inadequate for many problems, there are extensions of ordinary
anti-unification in various directions.
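As an illustration of the first order case, the following is a minimal sketch of ordinary anti-unification in the style of the classical algorithms of [13] [15]. The term encoding is an assumption made here: constants are strings, compound terms are tuples whose first component is the function symbol, and fresh variables are named X0, X1, ... (so constants must not use these names). Identical pairs of disagreeing subterms are mapped to the same variable, which is what makes the result least general.

```python
def _generalize(t, s, table):
    """Generalize t and s, recording each disagreement pair in table."""
    if t == s:
        return t
    same_shape = (isinstance(t, tuple) and isinstance(s, tuple)
                  and t[0] == s[0] and len(t) == len(s))
    if same_shape:
        args = tuple(_generalize(a, b, table) for a, b in zip(t[1:], s[1:]))
        return (t[0],) + args
    if (t, s) not in table:
        table[(t, s)] = "X%d" % len(table)   # fresh variable for this pair
    return table[(t, s)]

def lgg(t, s):
    """Return (r, theta1, theta2) with r*theta1 == t and r*theta2 == s."""
    table = {}
    r = _generalize(t, s, table)
    theta1 = {var: pair[0] for pair, var in table.items()}
    theta2 = {var: pair[1] for pair, var in table.items()}
    return r, theta1, theta2

def apply(term, theta):
    """Apply a substitution to a term."""
    if isinstance(term, tuple):
        return (term[0],) + tuple(apply(arg, theta) for arg in term[1:])
    return theta.get(term, term)

t = ("f", "a", ("g", "a"))
s = ("f", "b", ("g", "b"))
r, theta1, theta2 = lgg(t, s)
print(r, theta1, theta2)
```

Here the two occurrences of the pair (a, b) are replaced by the same variable, giving f(X0, g(X0)) rather than the more general f(X0, g(X1)).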

One direction of extending the anti-unification problem is to take into consideration
some kinds of background information, as in [10]. One typical example

¹ The words generalization and anti-unification are often used interchangeably. Here
we will use anti-unification to denote the pure syntactic first order anti-unification, 
i.e., instantiation as the ordering, Robinson’s formulation as the language. We use 
generalization to denote its various extensions. 
is the relative least general generalization under $\theta$-subsumption [14]. There are
various generalization methods in the area of inductive logic programming. More
recently, there are generalizations under implication [8], and generalizations in
constraint logic [11].

Another direction of extension is to raise the order of the underlying
language. The problem with higher order generalization is that without some
restrictions, the generalization is not well-defined. For example, the common
generalizations of $Aa$ and $Bb$ without restriction would be: $fx$, $fa$, $fb$, $fab$, $fA$, $fB$,
..., $f(Aa, Bb)$, $f(g(A, B), g(a, b))$, ..., where $f$ and $g$ are variables. Actually, there
are an infinite number of generalizations. Obviously, some restrictions must be
imposed on higher order generalization.

This paper is devoted to the study of higher order generalization. More
specifically, we study the conditions under which the least higher order
generalization exists and is unique. The study is directly motivated by our research on
analogical (inductive) programming and analogical (inductive) theorem proving.
The most closely related works are [12] [3].

[12] studied generalization in a restricted form of the calculus of constructions
[2], where terms are higher-order patterns, i.e., free variables can only be applied
to distinct bound variables. One problem of generalization in higher-order
patterns is over-generalization. For example, the least generalization of Aa
and Ba would be a single variable x instead of fa or fx, where we suppose
A, B, a are constants, and f, x are variables. Another problem of higher-order
patterns is that they are inadequate to express some problems. In particular, they
cannot represent recursion in their terms.

This motivated the study of generalization in $M\lambda$ [3]. In $M\lambda$, free variables
can be applied to object terms, which can contain constants and free variables in
addition to bound variables. In this sense, $M\lambda$ extends $L\lambda$. On the other hand,
it also adds some restrictions. One restriction is that $M\lambda$ is situated in a simply
typed $\lambda$-calculus instead of the calculus of constructions. Another restriction is
that $M\lambda$ does not have type variables, hence it can only generalize two terms of
the same type. The result is not satisfactory in that the least general generalization is
unique only up to substitution. That means any two terms beginning with functional
variables are considered equal.

Unlike the other approaches, which mainly put restrictions on the underlying
language, we mainly restrict the notion of the ordering between terms. Our
discussion is situated in a restricted form of the language $\lambda 2$ [1]. The reason
for choosing $\lambda 2$ is that it is a simple calculus which allows type variables. It can
be used to formalise various concepts in programming languages, such as type
definition, abstract data types, and polymorphism. The restriction we add is
that abstractions should not occur inside arguments. In the restricted language
$\lambda 2$, we propose the following:



- An ordering between terms, called the application ordering (denoted as $\succeq$), which
is similar to, but not the same as, the substitution (instantiation) ordering
[15] [12].

- A kind of restriction on orderings, called the subterm restriction (the
corresponding ordering is denoted as $\succeq_S$), which is implicit in first order
languages, but usually not assumed in higher order languages.

- An extension to the ordering, called variable freezing (the corresponding
ordering is denoted as $\succeq_{SF}$), which makes the ordering more useful while
keeping the matching and generalization problems decidable.

- A generalization method based on the afore-mentioned ordering.

Based on the $\succeq_{SF}$ ordering, we have the following results, similar to those for
first order anti-unification:

- For any two terms t and s, $t \succeq_{SF} s$ is decidable.

- The least general generalization exists.

- The least general generalization is unique up to renaming.



2 Preliminaries 

The syntax of the restricted $\lambda 2$ can be defined as follows [1]:

Definition 1 (types and terms). The set of types is defined as:

$V = \{\alpha, \alpha_1, \alpha_2, \ldots\}$ (type variables),
$C = \{\gamma, \gamma_1, \gamma_2, \ldots\}$ (type constants),
$T = V \mid C \mid T \to T \mid [V]T$ (types).

The set of terms is defined as:

$X = \{x, x_1, x_2, \ldots\}$ (variables),
$A = \{a, a_1, a_2, \ldots\}$ (constants),
$\Lambda_1 = X \mid A \mid \Lambda_1\Lambda_1 \mid \Lambda_1 T$ (terms without abstraction),
$\Lambda = \Lambda_1 \mid [X : T]\Lambda \mid [V]\Lambda$ (terms).

Here, for convenience, we use $[x : \sigma]$ instead of $\lambda x : \sigma$. Also,
we use the same notation $[V]$ to denote $\Lambda V$ (and $\forall V$), since we can distinguish
among $\lambda$, $\Lambda$ and $\forall$ from the context.

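The assignment rules of $\lambda 2$ are listed here for ease of reference:

Definition 2. Let $\sigma$, $\tau$ be types. $\Gamma \vdash t : \sigma$ is defined by the following axiom
and rules:

(start) $\Gamma \vdash x : \sigma$, if $(x : \sigma) \in \Gamma$;

($\to$E) $\dfrac{\Gamma \vdash t : (\sigma \to \tau) \qquad \Gamma \vdash s : \sigma}{\Gamma \vdash ts : \tau}$

($\to$I) $\dfrac{\Gamma, x : \sigma \vdash t : \tau}{\Gamma \vdash [x : \sigma]t : (\sigma \to \tau)}$

($\forall$E) $\dfrac{\Gamma \vdash t : [\alpha]\sigma}{\Gamma \vdash t\tau : \sigma[\alpha := \tau]}$

($\forall$I) $\dfrac{\Gamma \vdash t : \sigma}{\Gamma \vdash [\alpha]t : [\alpha]\sigma}$, if $\alpha \notin FV(\Gamma)$.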

We call a term t valid (under $\Gamma$) if there is a type $\sigma$ such that $\Gamma \vdash t : \sigma$. We
use Typ(t) to denote the type of t. Atoms are either constants or variables. By
closed terms we mean terms that do not contain occurrences of free variables.
In the following discussion, unless specified otherwise, we assume all terms are
closed and in long $\beta\eta$-normal form. Given $\Delta = [x_1 : \sigma_1][x_2 : \sigma_2]\ldots[x_n : \sigma_n]$ and a
term t, $[\Delta]t$ denotes $[x_1 : \sigma_1][x_2 : \sigma_2]\ldots[x_n : \sigma_n]t$. When type information is not
important, $[x : \sigma]t$ is abbreviated as $[x]t$.

Following [12], we have a similar notion of renaming. Given natural numbers
n and p, a partial permutation $\phi$ from n into p is an injective mapping from
{1, 2, ..., n} into {1, 2, ..., p}. A renaming of a term $[x_1 : \sigma_1][x_2 : \sigma_2]\ldots[x_p : \sigma_p]t$ is
a valid and closed term $[x_{\phi(1)} : \sigma_{\phi(1)}][x_{\phi(2)} : \sigma_{\phi(2)}]\ldots[x_{\phi(n)} : \sigma_{\phi(n)}]t$. Intuitively,
renaming permutes the abstractions and drops some of them when allowed. For
example, $[x_3, x_1 : \gamma]Ax_1x_3$ is a renaming of $[x_1, x_2, x_3 : \gamma]Ax_1x_3$.

3 Application orderings 

3.1 Application ordering ($\succeq$)

Definition 3 ($\succeq$). Given two terms t and s, t is more general than s (denoted
as $t \succeq s$) if there exists a sequence of terms and types $r_1, r_2, \ldots, r_k$ such that
$t r_1 r_2 \ldots r_k$ is valid and $t r_1 r_2 \ldots r_k = s$. Here k is a natural number.

To distinguish $\succeq$ from the usual instantiation ordering (denoted as $\geq$), we
call $\succeq$ the application ordering. Compared with the instantiation ordering, the
application ordering does not lose generality, in the sense that for every two
terms t and s in $\lambda 2$, if $t \geq s$, and $t_1$ and $s_1$ are the closed forms of t and s, then
$t_1 \succeq_F s_1$, where $\succeq_F$ is defined in Section 3.3.

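Example 4. The following are some examples of the application ordering.

$[\alpha][f : \alpha \to \alpha \to \alpha][x, y : \alpha]fxy$
$\succeq [f : \gamma \to \gamma \to \gamma][x, y : \gamma]fxy$
$\succeq [x, y : \gamma]Axy$
$\succeq [y : \gamma]Aay$
$\succeq Aab$.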
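
The chain in Example 4 can be read off by applying one term or type at a time. The derivation below spells out the argument used at each step; the typings $A : \gamma \to \gamma \to \gamma$ and $a, b : \gamma$ are not stated in the example and are assumed here.

```latex
\begin{align*}
([\alpha][f : \alpha \to \alpha \to \alpha][x, y : \alpha]\, f x y)\,\gamma
  &= [f : \gamma \to \gamma \to \gamma][x, y : \gamma]\, f x y \\
([f : \gamma \to \gamma \to \gamma][x, y : \gamma]\, f x y)\, A
  &= [x, y : \gamma]\, A x y \\
([x, y : \gamma]\, A x y)\, a
  &= [y : \gamma]\, A a y \\
([y : \gamma]\, A a y)\, b
  &= A a b
\end{align*}
```

Each line is an instance of Definition 3 with k = 1. Composing the four steps shows that the first term is more general than Aab with the sequence $(r_1, r_2, r_3, r_4) = (\gamma, A, a, b)$.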

$\succeq$ is reflexive and transitive:

Proposition 5. For any terms $t, t_1, t_2, t_3$,

1. $t \succeq t$;

2. If $t_1 \succeq t_2$, $t_2 \succeq t_3$, then $t_1 \succeq t_3$.

Proof:

1. Trivial.

2. Since $t_1 \succeq t_2$, $t_2 \succeq t_3$, there are sequences of terms or types $r_{11}, r_{12}, \ldots, r_{1n}$,
$r_{21}, r_{22}, \ldots, r_{2m}$ such that $t_1 r_{11} r_{12} \ldots r_{1n} = t_2$ and $t_2 r_{21} r_{22} \ldots r_{2m} = t_3$, hence
$t_1 r_{11} r_{12} \ldots r_{1n} r_{21} r_{22} \ldots r_{2m} = t_3$. □



3.2 Application ordering with subterm restriction ($\succeq_S$)

Because $\succeq$ is too general to be of practical use, we restrict the relation to
$\succeq_S$, called the subterm restriction. First of all, we define the notion of subterms.

Definition 6 (subterm). The set of subterms of a term t (denoted as subterm(t))
is defined as $decm(norm(t)) \cup \{Typ(r') \mid r' \in decm(norm(t))\}$.

Here norm(t) yields the $\beta\eta$-normal form of the term t, and decm(t) decomposes
a term recursively into the set of its components, defined as:

1. decm(c) = {c} (constants remain the same);

2. decm(z) = {} (variables are filtered out);

3. decm(ts) = decm(t) $\cup$ decm(s) $\cup$ {ts}, if there is no variable in ts;
   decm(ts) = decm(t) $\cup$ decm(s), otherwise;

4. decm($[\Delta]t$) = decm(t).
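
The four clauses of decm translate directly into a recursive function. The sketch below is only an illustration under simplifying assumptions: terms are untyped, normalization is skipped (the input is taken to be already normal), and the class names `Const`, `Var`, `App`, `Abs` are made up for the example. The type component of subterm(t) is therefore not computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Abs:
    var: str
    body: object


def has_var(term):
    """True if any variable (bound or free) occurs in term."""
    if isinstance(term, Var):
        return True
    if isinstance(term, App):
        return has_var(term.fun) or has_var(term.arg)
    if isinstance(term, Abs):
        return has_var(term.body)
    return False


def decm(term):
    """Decompose a term into its variable-free components."""
    if isinstance(term, Const):          # clause 1: constants are kept
        return {term}
    if isinstance(term, Var):            # clause 2: variables are dropped
        return set()
    if isinstance(term, App):            # clause 3: keep ts only if variable-free
        parts = decm(term.fun) | decm(term.arg)
        return parts if has_var(term) else parts | {term}
    if isinstance(term, Abs):            # clause 4: look under the binder
        return decm(term.body)
    raise TypeError(f"not a term: {term!r}")


# [x]Axa: only the constants A and a survive.
example = Abs("x", App(App(Const("A"), Var("x")), Const("a")))
components = decm(example)
```

Because identity and projection functions such as [x]x contain nothing but variables, they never appear in the result, which is the property the subterm restriction relies on.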



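Example 7. Assume $A : \gamma \to \gamma \to \gamma$, $B : \gamma \to \gamma$,

subterm($[x : \gamma]Axa$) = $\{[x, y : \gamma]Axy,\ a,\ \gamma,\ \gamma \to \gamma \to \gamma\}$
subterm($[f : \gamma \to \gamma][x : \gamma]f(Bx)$) = $\{[x : \gamma]Bx,\ \gamma \to \gamma\}$.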

As we can see, the subterms do not contain free variables. In fact, there are
no bound variables either, except in terms written in $\eta$-normal form (the $[x, y : \gamma]Axy$
in the above example). Here we exclude the identity and projection functions
as subterms. This is essential to guarantee that a least generalization exists in
the application ordering. The intuition behind this is that when we match two
higher order terms, in general there are the imitation rule and the projection rule [6].
Here only the imitation rule is used. We regard the projection rule as the source of
the unpleasant results and the complexities in higher order generalization.

Definition 8 ($\succeq_S$). Given two terms t and s, t is more general than s by
subterms (denoted as $t \succeq_S s$) if there exists a sequence $r_1, r_2, \ldots, r_k$ such that
$t r_1 r_2 \ldots r_k = s$. Here $r_i \in subterm(s)$, $i \in \{1, 2, \ldots, k\}$, and k is a natural number.

Example 9. $[f][x]fx \succeq_S Aa$;

$[f][x]fx \succeq_S Bbc$;

$[\alpha][x : \alpha]x \succeq_S [x : \gamma]x \succeq_S Aa$;

$[f][x]fx \not\succeq_S a$, since the only subterm of a is a.

Due to the finiteness of subterm(s), the ordering $\succeq_S$ becomes much easier to
manage than $\succeq$.

Proposition 10. For any terms $t_1, t_2, t_3$,

1. There exists a procedure to decide if $t_1 \succeq_S t_2$.

2. If $t_1 \succeq_S t_2$, $t_2 \succeq_S t_3$, then $t_1 \succeq_S t_3$.

Proof:

1. Since the set of subterms of $t_1$ is finite, it is obvious.

2. Since $t_1 \succeq_S t_2$, $t_2 \succeq_S t_3$, there are sequences of terms or types $r_{11}, r_{12}, \ldots,
r_{1n} \in subterm(t_2)$, $r_{21}, r_{22}, \ldots, r_{2m} \in subterm(t_3)$, such that $t_1 r_{11} r_{12} \ldots r_{1n} = t_2$
and $t_2 r_{21} r_{22} \ldots r_{2m} = t_3$. Hence $t_1 r_{11} r_{12} \ldots r_{1n} r_{21} r_{22} \ldots r_{2m} = t_3$. Besides,
since we cannot eliminate constants in $t_2$ when applying terms to it, and
$r_{11}, r_{12}, \ldots, r_{1n}$ are constants in $t_2$, $r_{11}, r_{12}, \ldots, r_{1n}$ must also be
subterms of $t_3$. Hence we have $t_1 \succeq_S t_3$. □



3.3 Application ordering with subterm restriction and variable
freezing extension ($\succeq_{SF}$)

The ordering $\succeq_S$ is restrictive in that $[x][y]Axy \not\succeq_S [x]Axa$. To solve this
problem, we have:

Definition 11 ($\succeq_F$). t is a generalization of s by variable freezing, denoted as
$t \succeq_F s$, if either

- $t \succeq s$, or

- for an arbitrary type constant or term constant c such that sc is valid,
$t \succeq_F sc$.

Intuitively, here we first freeze some variables in s as constants, then try to
do generalization. The word freeze comes from [7], which has the notion that
when unifying two free variables, we can regard one of them as a constant.

The ordering $\succeq_F$ is too general to be managed, so we have the following
restricted form:

Definition 12 ($\succeq_{SF}$). $t \succeq_{SF} s$, if either

- $t \succeq_S s$, or

- for an arbitrary type constant or term constant c such that sc is valid,
$t \succeq_{SF} sc$.

Now we have $[x][y]Axy \succeq_{SF} [x]Axa$. The notion of $\succeq_{SF}$ not only mimics, but
also extends the usual meaning of the instantiation ordering. For example, we have
$[x, y]Axy \succeq_{SF} [x]Axx$, which cannot be obtained in the instantiation ordering.

Example 13. The following relations hold:

$[\alpha][x : \alpha]x \succeq_{SF} [\alpha][f : \alpha \to \alpha][x : \alpha]fx$
$\succeq_S [f : \gamma \to \gamma][x : \gamma]fx$
$\succeq_S [x : \gamma]Ax$
$\succeq_S Aa$;

$[f][z, x, y]f(Axy, z) \succeq_S [z, x, y]A(Axy, z) \succeq_S A(Aab, Aab)$;

$[f][z, x, y]f(Axy, z) \not\succeq_{SF} [x, y]A(Axy, Axy)$, since Axy is not a subterm of
$[x, y]A(Axy, Axy)$;

$[\alpha][f : \alpha \to \alpha][x : \alpha]fx \not\succeq_{SF} [\alpha][x : \alpha]x$, since identity and projection
functions are not subterms.
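
The two freezing claims made just before Example 13 can be checked by hand. In the derivation below, c stands for a fresh constant introduced by freezing; the name is chosen here for illustration only.

```latex
\begin{align*}
([x]\,A x a)\,c &= A c a, &
([x][y]\,A x y)\,c\,a &= A c a, &
c,\ a &\in \mathit{subterm}(A c a); \\
([x]\,A x x)\,c &= A c c, &
([x, y]\,A x y)\,c\,c &= A c c, &
c &\in \mathit{subterm}(A c c).
\end{align*}
```

In both rows the more general term reaches the frozen instance by applying only subterms of that instance, so the subterm-restricted relation holds after freezing, and the ordering with freezing therefore holds for the original pair.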



Proposition 14. For any terms t and s,

1. $t \succeq_{SF} s$ iff there exists a sequence (possibly an empty sequence) of new,
distinct constants $c_1, c_2, \ldots, c_k$ such that $s c_1 c_2 \ldots c_k$ is of atomic type, and
$t \succeq_S s c_1 c_2 \ldots c_k$.

2. There exists a procedure to decide if $t \succeq_{SF} s$.

3. Suppose t = s. If $t \succeq_{SF} r$, then $s \succeq_{SF} r$. If $r \succeq_{SF} t$, then $r \succeq_{SF} s$.

Proof:

1. ($\Rightarrow$) Suppose $t \succeq_{SF} s$. If s is of atomic type, then it is trivial. Now suppose s
is of type $\sigma \to \tau$, and c is a constant of type $\sigma$. If $t \succeq_S s$, then $t \succeq_S sc$. If $t \not\succeq_S s$,
then by definition of $\succeq_{SF}$, there exists c such that $t \succeq_{SF} sc$.

($\Leftarrow$) Suppose there exists a sequence of new constants $c_1, c_2, \ldots, c_k$ such
that $s c_1 c_2 \ldots c_k$ is of atomic type, and $t \succeq_S s c_1 c_2 \ldots c_k$. By definition of $\succeq_{SF}$,
$t \succeq_{SF} s c_1 c_2 \ldots c_{k-1}$, $t \succeq_{SF} s c_1 c_2 \ldots c_{k-2}$, ..., $t \succeq_{SF} s$.

2. Since $t \succeq_{SF} s$ iff $t \succeq_S s c_1 c_2 \ldots c_k$, and we know $\succeq_S$ is decidable, $t \succeq_{SF} s$
is decidable.

3. If $t \succeq_{SF} r$, then there exists a sequence of new constants $c_1, c_2, \ldots, c_k$ such
that $r c_1 c_2 \ldots c_k$ is of atomic type, and $t \succeq_S r c_1 c_2 \ldots c_k$. There exists a sequence
of terms or types $r_1, \ldots, r_l$ such that $t r_1 \ldots r_l = r c_1 c_2 \ldots c_k$. Since t = s, we
have $s r_1 \ldots r_l = r c_1 c_2 \ldots c_k$, so $s \succeq_{SF} r$.

The second claim can be proved in a similar way. □



Proposition 15. Suppose $t_1 = [\Delta]h s_1 s_2 \ldots s_m$, $t_2 = [\Delta']h' s'_1 s'_2 \ldots s'_n$, and
$t_1 \succeq_{SF} t_2$, then

1. $m \leq n$,

2. $[\Delta]s_k \succeq_{SF} [\Delta']s'_{k+n-m}$, for $k \in \{1, 2, \ldots, m\}$,

3. If h is a constant, then h' must be a constant, and h = h', m = n.

Proof: Suppose $[\Delta'] = [z_1, z_2, \ldots, z_j]$. Since $t_1 \succeq_{SF} t_2$, we have
$[\Delta]h s_1 s_2 \ldots s_m \succeq_S (h' s'_1 s'_2 \ldots s'_n)[\bar{c}/\bar{z}]$, where $\bar{z}$ is the sequence $z_1, \ldots, z_j$ and
$\bar{c}$ is a sequence of new constant symbols $c_1, \ldots, c_j$. Now, suppose each variable in
$h' s'_1 s'_2 \ldots s'_n$ is fixed as a new constant; then $h s_1 s_2 \ldots s_m$ should match $h' s'_1 s'_2 \ldots s'_n$ in
the sense of [Huet78]. As we know, the complete minimal matches are generated
by the imitation rule and the projection rule. In the projection rule, the
substitutions are $\{h \mapsto [x_1, \ldots, x_m]x_i \mid i \in \{1, \ldots, m\}\}$, which do not satisfy our
subterm restriction. Thus the only way to match is by using the imitation rule.
By the imitation rule we have substitutions
$\{h \mapsto [x_1, \ldots, x_m]h'(h_1 x_1 \ldots x_m) \ldots (h_n x_1 \ldots x_m)\}$, where $h_1, \ldots, h_n$ are new variables.
On the other hand, the subterms of $t_2$ whose head is h' could only be:

$[x_1, \ldots, x_n]h' x_1 x_2 \ldots x_n$,

$[x_2, \ldots, x_n]h' s''_1 x_2 \ldots x_n$,

...

$[x_{i+1}, \ldots, x_n]h' s''_1 s''_2 \ldots s''_i x_{i+1} \ldots x_n$,

where each $s''_l$ is either $s'_l$, or other possible terms inside the arguments if
h' also occurs in the arguments. So the possible substitution must be
$h \mapsto [x_{i+1}, \ldots, x_n]h' s''_1 s''_2 \ldots s''_i x_{i+1} \ldots x_n$, where i + m = n. After the substitution, we
have to match the terms $h' s''_1 \ldots s''_i s_1 s_2 \ldots s_m$ and $h' s'_1 s'_2 \ldots s'_n$, i.e., $[\Delta]s_k \succeq_{SF}
[\Delta']s'_{k+n-m}$, for $k \in \{1, 2, \ldots, m\}$.

When h is a constant, it is obvious that h' = h. □

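It is clear that $\succeq_{SF}$ is reflexive and transitive:

Proposition 16. For any terms $t, t_1, t_2, t_3$,

1. $t \succeq_{SF} t$.

2. If $t_1 \succeq_{SF} t_2$, $t_2 \succeq_{SF} t_3$, then $t_1 \succeq_{SF} t_3$.

Proof:

1. Obvious.

2. We can suppose

$t_1 = [\Delta]h s_1 s_2 \ldots s_m$,

$t_2 = [\Delta']h' r_{11} \ldots r_{1i} s'_1 s'_2 \ldots s'_m$,

$t_3 = [\Delta'']h'' r_{21} \ldots r_{2j} r_{31} \ldots r_{3i} s''_1 s''_2 \ldots s''_m$.

Case 1: m = 0. Then it is easy to verify $t_1 \succeq_{SF} t_3$.

Case 2: m > 0. We have $[\Delta]s_k \succeq_{SF} [\Delta']s'_k \succeq_{SF} [\Delta'']s''_k$, for $k \in \{1, \ldots, m\}$.
By inductive hypothesis, $[\Delta]s_k \succeq_{SF} [\Delta'']s''_k$. If h is a constant, we have
h = h' = h'', i = j = 0, thus $t_1 \succeq_{SF} t_3$. If h is a variable, let h substitute
$h'' r_{21} \ldots r_{2j} r_{31} \ldots r_{3i}$. □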

Definition 17 (=). t = s is defined as $t \succeq_{SF} s$ and $s \succeq_{SF} t$.

Example 18. [x, y]Axy = [y, x]Axy = [z, x, y]Axy.

Proposition 19. t = s iff t is a renaming of s.

^2 Here the variables are frozen.

Proof: ($\Rightarrow$) Assume t = s, then $t \succeq_{SF} s$ and $s \succeq_{SF} t$. Suppose
$t = [\Delta]h t_1 t_2 \ldots t_m$, $s = [\Delta']h' s_1 s_2 \ldots s_n$. Since $t \succeq_{SF} s$, we have $m \leq n$. And
similarly, we have $n \leq m$. So, m = n. If h is a constant, then h' = h. Similarly,
if h' is a constant then h' = h. Hence, h and h' must be either the
same constant, or variables.

Case 1. m = 0. Obviously t and s only differ by renaming.

Case 2. m > 0. We have $[\Delta]t_k \succeq_{SF} [\Delta']s_k$ and $[\Delta']s_k \succeq_{SF} [\Delta]t_k$, for $k \in
\{1, \ldots, m\}$. By inductive hypothesis, $[\Delta]t_k$ and $[\Delta']s_k$ only differ by variable
renaming. On the other hand, h and h' are either variables or the same constant.

($\Leftarrow$) We only need to consider the following cases:

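Case 1. Suppose

$t = [x_1, x_2, \ldots, x_i]h t_1 t_2 \ldots t_m$,

$s = [x_{\phi(1)}, x_{\phi(2)}, \ldots, x_{\phi(i)}]h t_1 t_2 \ldots t_m$.

Then $t c_1 \ldots c_i = s c_{\phi(1)} \ldots c_{\phi(i)}$, hence t = s.

Case 2. Suppose

$t = [x][x_1, x_2, \ldots, x_i]h t_1 t_2 \ldots t_m$,

$s = [x_1, x_2, \ldots, x_i]h t_1 t_2 \ldots t_m$,

where x does not occur in $h t_1 t_2 \ldots t_m$. Then tc = s, $s \succeq_{SF} t$. Also, we have
$t \succeq_{SF} s$, hence t = s.

Case 3. Suppose

$t = [g : \gamma_1 \to \gamma_2 \to \ldots \to \gamma_i \to \gamma]g t_1 t_2 \ldots t_m$,

$s = [f : \gamma_{\phi(1)} \to \gamma_{\phi(2)} \to \ldots \to \gamma_{\phi(i)} \to \gamma]f t_{\phi(1)} t_{\phi(2)} \ldots t_{\phi(m)}$.

Then $tA = s([x_{\phi(1)} : \gamma_{\phi(1)}, x_{\phi(2)} : \gamma_{\phi(2)}, \ldots, x_{\phi(i)} : \gamma_{\phi(i)}]A x_1 x_2 \ldots x_i)$, hence t = s. □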



4 Generalization 

If $t \succeq_{SF} s_1$ and $t \succeq_{SF} s_2$, then t is called a common generalization of $s_1$ and $s_2$.
If t is a common generalization of $s_1$ and $s_2$, and for any common generalization
$t_1$ of $s_1$ and $s_2$, $t_1 \succeq_{SF} t$, then t is called the least general generalization (LGG).
This section is only concerned with $\succeq_{SF}$, hence in the following discussion the
subscript SF is omitted.

The following algorithm Gen(t, s, {}) computes the least general generalization
of t and s. Recall that we assume t and s are closed terms. At the beginning
of the procedure we suppose all the bound variables in t and s are distinct.
Here an auxiliary (the third) global variable C is needed to record the previous
correspondence between terms in the course of generalization, so that we can
avoid introducing unnecessary new variables. C is a bijection between pairs of
terms (and types) and a set of variables. Initially, C is the empty set. Following
the usual practice, it is sufficient to consider only long $\beta\eta$-normal forms. Without
loss of generality, suppose t and s are of the following forms:

$t = [\Delta]h(t_1, t_2, \ldots, t_k)$,

$s = [\Delta']h'(r_1, \ldots, r_i, s_1, s_2, \ldots, s_k)$, where h and h' are atoms. Suppose

$[\Delta, \Delta', \Delta_1]t'_1 = Gen([\Delta]t_1, [\Delta']s_1, C)$,

$[\Delta, \Delta', \Delta_2]t'_2 = Gen([\Delta, \Delta_1]t_2, [\Delta', \Delta_1]s_2, C)$,

...

$[\Delta, \Delta', \Delta_k]t'_k = Gen([\Delta, \Delta_{k-1}]t_k, [\Delta', \Delta_{k-1}]s_k, C)$,

$Typ(h) = \sigma_1, \sigma_2, \ldots, \sigma_k \to \sigma_{k+1}$,

$Typ(h'(r_1, \ldots, r_i)) = \tau_1, \tau_2, \ldots, \tau_k \to \tau_{k+1}$.

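Gen(t, s, C):

Case 1: h = h': $Gen(t, s, C) = [\Delta, \Delta', \Delta_k]h(t'_1, t'_2, \ldots, t'_k)$;

Case 2: $h \neq h'$:

Case 2.1: $\exists x.((h, h'(r_1, \ldots, r_i)), x) \in C$:

$Gen(t, s, C) = [\Delta, \Delta', \Delta_k]x(t'_1, t'_2, \ldots, t'_k)$;

Case 2.2: $\neg\exists x.((h, h'(r_1, \ldots, r_i)), x) \in C$:

Case 2.2.1: $Typ(h) = Typ(h'(r_1, \ldots, r_i))$:

$Gen(t, s, C) = [\Delta, \Delta', \Delta_k][x : \sigma_1, \sigma_2, \ldots, \sigma_k \to \sigma_{k+1}]x(t'_1, t'_2, \ldots, t'_k)$;

$C := \{((h, h'(r_1, \ldots, r_i)), x)\} \cup C$;

Case 2.2.2: $Typ(h) \neq Typ(h'(r_1, \ldots, r_i))$:

Without loss of generality, suppose $\sigma_j \neq \tau_j$, $j \in \{1, 2, \ldots, k, k+1\}$.

Case 2.2.2.1: $\exists \alpha_j.((\sigma_j, \tau_j), \alpha_j) \in C$:

$Gen(t, s, C) = [\Delta, \Delta', \Delta_k][x : \sigma_1, \ldots, \alpha_j, \ldots, \sigma_{k+1}]x(t'_1, t'_2, \ldots, t'_k)$;

$C := \{((h, h'(r_1, \ldots, r_i)), x)\} \cup C$;

Case 2.2.2.2: $\neg\exists \alpha_j.((\sigma_j, \tau_j), \alpha_j) \in C$:

$Gen(t, s, C) = [\Delta, \Delta', \Delta_k][\alpha_j][x : \sigma_1, \ldots, \alpha_j, \ldots, \sigma_{k+1}]x(t'_1, t'_2, \ldots, t'_k)$;

$C := \{((h, h'(r_1, \ldots, r_i)), x)\} \cup C$;

$C := \{((\sigma_j, \tau_j), \alpha_j)\} \cup C$.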

In the following, let $t \sqcup s = Gen(t, s, \{\})$.
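
The head-matching part of Gen (Cases 1, 2.1 and 2.2.1) can be sketched as follows. This is a simplified illustration, not the algorithm as stated: it ignores types and binders entirely, works on ground terms encoded as tuples, and the names `split`, `gen`, `lgg` and the `?0`, `?1`, ... variable names are assumptions made for the example. Surplus leading arguments are folded into the head, mirroring the role of $h'(r_1, \ldots, r_i)$.

```python
from itertools import count


def split(term):
    """Return (head, args) for a term; an atom has no arguments."""
    if isinstance(term, tuple):
        return term[0], list(term[1:])
    return term, []


def gen(t, s, table, fresh):
    """Generalize ground terms t and s, reusing variables stored in table."""
    h, t_args = split(t)
    g, s_args = split(s)
    k = min(len(t_args), len(s_args))
    # Align the last k arguments; surplus leading ones join the head.
    t_extra, t_args = t_args[:len(t_args) - k], t_args[len(t_args) - k:]
    s_extra, s_args = s_args[:len(s_args) - k], s_args[len(s_args) - k:]
    t_head = (h, *t_extra) if t_extra else h
    s_head = (g, *s_extra) if s_extra else g
    if t_head == s_head:
        head = t_head                      # Case 1: identical heads are kept
    else:
        key = (t_head, s_head)
        if key not in table:               # Case 2.2: first time this pair is seen
            table[key] = f"?{next(fresh)}"
        head = table[key]                  # Case 2.1: reuse the recorded variable
    args = [gen(a, b, table, fresh) for a, b in zip(t_args, s_args)]
    return (head, *args) if args else head


def lgg(t, s):
    """Generalize t and s starting from an empty correspondence table."""
    return gen(t, s, {}, count())


mult = ("<-", ("multiply", ("s", "X"), "Y", "Z"),
        ("multiply", "X", "Y", "W"), ("add", "W", "Y", "Z"))
expo = ("<-", ("exponent", ("s", "X"), "Y", "Z"),
        ("exponent", "X", "Y", "W"), ("multiply", "W", "Y", "Z"))
schema = lgg(mult, expo)
```

The table plays the part of C: because the pair (multiply, exponent) is recorded once, both of its occurrences in the clause are replaced by the same variable, while the pair (add, multiply) receives a different one.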

Example 20. Some examples of least general generalization.

$[x : \gamma]x \sqcup Aa = [x : \gamma][\alpha][y : \alpha]y = [\alpha][y : \alpha]y$, if Aa is not of type $\gamma$;

$[x : \gamma]x \sqcup Aa = [x : \gamma][y : \gamma]y = [x : \gamma]x$, if Aa is of type $\gamma$;

$[x]Axx \sqcup [x]Aax = [x, y]Axy$;

$Aa \sqcup Bb = [f][x]fx$, if A and B are of the same type;

$Aa \sqcup Bb = [\alpha][f : \alpha \to \gamma][x : \alpha]fx$, if $A : \gamma_1 \to \gamma$ and $B : \gamma_2 \to \gamma$.

Example 21. Here is an example of generalizing segments of programs. For
clarity the segments are written in the usual notation. Let

t = [x]map1(cons(a, x)) = cons(succ(a), map1(x)),

s = [x]map2(cons(a, x)) = cons(sqr(a), map2(x)).

Suppose the types are

map1 : List(Nat) $\to$ Nat; succ : Nat $\to$ Nat,
map2 : List(Nat) $\to$ Nat; sqr : Nat $\to$ Nat.

Then

$t \sqcup s$ = [f : List(Nat) $\to$ Nat; g : Nat $\to$ Nat][x]f(cons(a, x)) = cons(g(a), f(x)).

The termination of the algorithm is obvious, since we recursively decompose
the terms to be generalized, and the size of the terms strictly decreases in
each step. What we need to prove is the uniqueness of the generalization. The
following can be proved by induction on the definition of terms:

Proposition 22.

1. (consistency) $t \sqcup s \succeq t$, $t \sqcup s \succeq s$.

2. (termination) For any two terms t and s, Gen(t, s, {}) terminates.

3. (absorption) If $t \succeq s$, then $t \sqcup s = t$.

4. (idempotency) $t \sqcup t = t$.

5. (commutativity) $t \sqcup s = s \sqcup t$.

6. (associativity) $(t \sqcup s) \sqcup r = t \sqcup (s \sqcup r)$.

7. If t = s, then $t \sqcup r = s \sqcup r$.

8. (monotonicity) If $t \succeq s$, then for any term r, $t \sqcup r \succeq s \sqcup r$.

9. If t = s, then $t \sqcup s = t = s$.

Proof:

1. It can be verified that in each case of the algorithm we obtain a more
general term.

2. It is obvious since we decompose the terms recursively.

3. Since $t \succeq s$, we can suppose

$t = [\Delta]h s_1 s_2 \ldots s_m$,

$s = [\Delta']h' r_{11} \ldots r_{1i} s'_1 s'_2 \ldots s'_m$, and

$[\Delta]s_k \succeq [\Delta']s'_k$, $k \in \{1, \ldots, m\}$.

If m = 0, then it is easy to verify the conclusion. Now suppose m > 0. Without
loss of generality, suppose h is a variable which does not occur in s, and has
a single occurrence in t. h has the same type as $h' r_{11} \ldots r_{1i}$. Other cases can
be proved in a similar way. Now we can suppose $t \sqcup s = [\Delta''][f]f t_1 t_2 \ldots t_m$.
If $s_k$ is a constant, then $s'_k$ must be the same constant. Hence $t_k = s_k$. If $s_k$
is a variable, then $t_k$ is a new variable. There are two cases: one is that $s_k$ has
only one occurrence in t. Then t' and t only differ by renaming. The other
case is that $s_k$ has multiple occurrences in t. Since $t \succeq s$, all the occurrences
of $s_k$ must correspond to the same term in s. Hence, due to the presence of the
global variable C, all the occurrences of $s_k$ are generalized as the same variable.
Hence t = t'. By inductive hypothesis, we have
$[\Delta]s_k \sqcup [\Delta']s'_k = [\Delta]s_k$.

4. From $t \succeq t$ and Proposition 22.3 we have the result.

5. It is obvious from the algorithm.

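6. Without loss of generality, we can suppose

$t = [\Delta]h s_1 s_2 \ldots s_m$,

$s = [\Delta']h' r_{11} \ldots r_{1i} s'_1 s'_2 \ldots s'_m$,

$r = [\Delta'']h'' r_{21} \ldots r_{2j} r_{31} \ldots r_{3i} s''_1 s''_2 \ldots s''_m$,

and suppose h, h', h'' are distinct constants, and t, s, r are of the same type. The
other cases can be proved in a similar way. By inductive hypothesis, for
$k \in \{1, \ldots, m\}$, $p \in \{1, \ldots, i\}$, we can suppose:

$([\Delta]s_k \sqcup [\Delta']s'_k) \sqcup [\Delta'']s''_k = [\Delta]s_k \sqcup ([\Delta']s'_k \sqcup [\Delta'']s''_k)$,

$[\Delta]s_k \sqcup [\Delta']s'_k = [\Gamma]t_k$,

$[\Gamma]t_k \sqcup [\Delta'']s''_k = [\Gamma'']t''_k$,

$[\Delta']s'_k \sqcup [\Delta'']s''_k = [\Gamma']t'_k$,

$[\Delta]s_k \sqcup [\Gamma']t'_k = [\Gamma'']t''_k$,

$[\Delta']r_{1p} \sqcup [\Delta'']r_{3p} = [\Gamma']r_p$.

Here we suppose $\Gamma$, $\Gamma'$, $\Gamma''$ are each large enough to cover all the
corresponding abstractions.

We rename the variables in $[\Gamma]t_1, \ldots, [\Gamma]t_m$ such that there are multiple
occurrences of a variable x in $[\Gamma]t_1, \ldots, [\Gamma]t_m$ if and only if its corresponding places
in $s_1, \ldots, s_m$ hold a same term, and its corresponding places in $s'_1, \ldots, s'_m$ hold
another same term. Similarly, we rename the terms $[\Gamma']t'_k$, $[\Gamma'']t''_k$. Then

$(t \sqcup s) \sqcup r$

$= [\Gamma][f]f t_1 \ldots t_m \sqcup [\Delta'']h'' r_{21} \ldots r_{2j} r_{31} \ldots r_{3i} s''_1 s''_2 \ldots s''_m$

$= [\Gamma''][f]f t''_1 \ldots t''_m$,

$t \sqcup (s \sqcup r)$

$= [\Delta]h s_1 s_2 \ldots s_m \sqcup [\Gamma'][g]g r_1 \ldots r_i t'_1 \ldots t'_m$

$= [\Gamma''][g]g t''_1 \ldots t''_m$.

Hence $(t \sqcup s) \sqcup r = t \sqcup (s \sqcup r)$.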

7. Since t = s, t is a renaming of s. t and s must be of the forms $[\Delta]h r_1 r_2 \ldots r_n$
and $[\Delta']h r_1 r_2 \ldots r_n$. It is obvious that $[\Delta]h r_1 r_2 \ldots r_n \sqcup r = [\Delta']h r_1 r_2 \ldots r_n \sqcup r$.

8. Since $t \succeq s$, we have $t \sqcup s = t$, hence

$t \sqcup r$

$= (t \sqcup s) \sqcup r$ (by Proposition 22.7)

$= t \sqcup (s \sqcup r)$ (associativity)

$\succeq s \sqcup r$ (by Proposition 22.1).

9. From t = s, we have $t \succeq s$, $s \succeq t$. Hence $t \sqcup s = t$, $t \sqcup s = s \sqcup t = s$. □

Based on the above propositions, we have

Theorem 23. $t \sqcup s$ is the least general generalization of t and s, i.e., for any
term r, if $r \succeq t$, $r \succeq s$, then $r \succeq t \sqcup s$.

Proof: Since $r \succeq t$, $r \succeq s$, we have $r \sqcup t = r$, $r \sqcup s = r$.

$r \sqcup (t \sqcup s)$

$= (r \sqcup r) \sqcup (t \sqcup s)$ (idempotency)

$= (r \sqcup t) \sqcup (r \sqcup s)$ (commutativity and associativity)

$= r \sqcup r$ (absorption)

$= r$ (idempotency).

Hence by Proposition 22.1 we have $r \succeq t \sqcup s$. □



Higher order generalization is mainly used to find schemata of programs,
proofs, or program transformations. For example, given the first order clauses
multiply(s(X), Y, Z) $\leftarrow$ multiply(X, Y, W), add(W, Y, Z), and
exponent(s(X), Y, Z) $\leftarrow$ exponent(X, Y, W), multiply(W, Y, Z),
we can obtain their least general generalization as
P(s(X), Y, Z) $\leftarrow$ P(X, Y, W), Q(W, Y, Z).

Higher order generalization also finds applications in analogy analysis [9].
It is commonly recognized that a good way to obtain the concrete correspondence
between two problems is to obtain the generalization of the two problems first.
During the generalization process, we should preserve the structure as much as
possible. By using the above higher order generalization method, we can find the
analogical correspondence between two problems in the course of generalization.

5 Discussions

With the subterm restriction and the freezing extension, we defined the ordering
$\succeq_{SF}$. As we have shown, this ordering and the corresponding generalization have
nice properties, almost the same as those of first order anti-unification. In particular,
the least general generalization exists and is unique.

To compare with other kinds of generalizations, we have the following
diagram:

[Diagram: orderings as vertices, including $\succeq_{SF}$ and $\succeq_F$, connected by arrows.]

Here each vertex represents a kind of ordering. For example, $\succeq_H$ means the
usual instantiation ordering in a higher order language, say $\lambda P2$ [1]; $\succeq_1$ the usual
instantiation ordering in a first order language; $\succeq_{M\lambda}$ the ordering in $M\lambda$; $\succeq_{L\lambda}$ the
ordering in $L\lambda$ (i.e., in higher order patterns), etc. An arrow means implication.
For example, if $t \succeq_S s$, then $t \succeq_{SF} s$ and $t \succeq_H s$. It can be seen that the relations
$\succeq_{SF}$ and $\succeq_H$ (also $\succeq_{L\lambda}$ and $\succeq_{M\lambda}$) are not comparable. By definition, $\succeq_{1S}$ (the
first order ordering with the subterm restriction) is the same as $\succeq_1$. That explains why
we have good results in $\succeq_{SF}$.

Our work differs from the others in the following aspects. Firstly, we
defined a new ordering $\succeq_{SF}$. In terms of this ordering, we generally obtain a much
more specific generalization. For example, the terms Aab and Bab would
be generalized as a single variable x in [12], or as fts in [3], where t and s are
arbitrary terms. In contrast, we will have [f]fab as their least general
generalization. Secondly, our approach can produce a meaningful generalization of terms
of different types and terms of different arities, instead of a single variable x. And
finally, our method is useful in applications such as analogical reasoning and
inductive inference [5] [9].

The Topic

Logics of action and change were originally conceived for the design of intelligent 
agents, as McCarthy proposed the situation calculus as the logic to be used 
by a logic-based “advice taker”. In more recent research one can distinguish 
two motives for these logics, namely, for what is now called cognitive robotics 
and for common-sense reasoning. The difference is that in cognitive robotics, 
the logic must also account for an incoming flow of sensory data and for the 
actuation of actions by the robot, whereas common-sense reasoning is oriented 
towards natural-language communication of the premises and the conclusions of 
the reasoning process. 

The present article addresses the case where a logic of actions and change is 
used for cognitive robotics purposes. An intelligent robotic agent will necessarily 
have a quite complex design, since it has to account for all of the following: 

1. Goal-directed behavior, including the principle of deliberated retry, that is, 
if the robot takes an action in order to pursue a goal and the action fails, then 
it is to consider some other action or sequence of actions that is likely to achieve 
the same goal. 

2. World modeling and the description of the effects of actions on several
levels of detail, ranging from simple approximations such as
precondition/postcondition descriptions, to the detailed specification in control
theory terms of how an action is performed.

3. The problem of imprecise and unreliable sensors. 

4. The occurrence of exogenous events, including both the prediction and 
early reaction to those exogenous events that can reasonably be predicted, and 
the proper dealing with those exogenous events that come as a surprise to the 
agent. In fact, it is not sufficient that the agent be able to deal with exoge- 
nous events in prediction mode and synchronous mode (recognizing them at the 
moment they occur), but it must also sometimes be able to diagnose aberrant 
situations and understand them in terms of exogenous events that may have 
occurred earlier in time. In other words, it must be capable of postdiction as 
well as prediction.

In order to master the design of such complex devices, it is necessary to
make use of concise and high-level specifications. It is somewhat remarkable, 
therefore, that most research on logics of actions and change has avoided the 
issues mentioned above. It has focused on the actions performed by the agent, the 
restrictions that may be imposed on them, and their indirect effects due to some 
kind of causation. These logics have not generally been used for characterizing 
the behavior of the agent, goal-directed or otherwise, and there has only been a 
small number of contributions on reasoning about unreliable sensor data. 

Research in the area of intelligent systems has approached some of these
topics from another angle, and in a few recent cases it has also used logic
as a means of characterizing the system design, complementing the traditional
approach of describing the software architecture directly.

We believe that this area at the intersection of two current research traditions 
- reasoning about actions and change, and intelligent agent systems - will attract 
a lot more attention in the forthcoming years. Stringent and high-level specifi- 
cations of intelligent systems will be necessary due to their inherent complexity 
and the severe requirements in practical applications, including the requirement 
of being able to assure correct behavior within a well defined envelope for what 
situations are assumed to arise. It will be natural to use logic for characterizing 
the behavior of the agent, and not only for characterizing the world in which the 
agent operates. 

Our Approach 

In the present invited lecture, we are going to describe our own recent work in 
the area that has now been described. In particular, [j-etai-1-105] (also available 
as [c-kr-98-304]) describes a method for characterizing the goal-directed behavior 
of deliberated retry in a first-order logic of time and action. Also, [j-aicom-9-214] 
and [c-hart-97-3] describe a method of relating high-level and low-level action 
descriptions in a similar logic. The results in those articles are complementary: 
the first articles address item 1 in the list of issues that was given above, whereas 
the latter two articles address items 2, 4, and to some extent item 3 in that list. 

The common logical framework for these articles is Time and Action Logic 
(TAL), that is, the direct representation of actions and of the state of the world 
in first-order predicate calculus. (This is the representation that we have been 
using consistently since 1989, although previously without a specific label). In 
TAL each interpretation represents one possible history of the world, in con- 
tradistinction to e.g. the situation calculus where an interpretation contains a 
tree of possible developments. (The difference is actually a bit less stringent, on 
both sides, but that is outside the present topic). For the characterization of 
goal-directed agent behavior, each model will therefore represent one history of 
the world in which the agent exhibits the required behavior pattern of deliberated 
retry in the face of the problems that may arise. 

In combining the results of those earlier articles, we obtain a logic that is 
able to represent the failure of actions due to exogenous events, and an agent
design where the agent is able to revise its plan and make a new attempt to
reach a goal after its present approach has failed. 

The present lecture will review the contents of the previous articles, and show 
through a concrete example how the combined system works. 

Related Work 

As a joint reference for these and other articles, we are maintaining a webpage 
structure with a summary of related work by others and links to those other 
articles. This structure is continuously extended with new references and com- 
mentary, so it is kept up-to-date in a way that a reference list in an article can 
not be. Please refer to the URL of the ETAI article for links to all such further 
information in its up-to-date form.

Abstract. We overview the design and implementation of Jinni^1 (Java
INference engine and Networked Interactor), a lightweight, multi-threaded,
pure logic programming language, intended to be used as a flexible
scripting tool for gluing together knowledge processing components and
Java objects in networked client/server applications, as well as through
applets over the Web.

Mobile threads, implemented by capturing first order continuations in a 
compact data structure sent over the network, allow Jinni components to 
interoperate with remote high performance BinProlog servers for CPU- 
intensive knowledge processing and with other Jinni components over 
the Internet. 

These features make Jinni a perfect development platform for intelligent 
mobile agent systems. 

Keywords: Java based Logic Programming languages, remote execu- 
tion, Linda coordination, blackboard-based distributed logic program- 
ming, mobile code through first order continuations, intelligent mobile 
agents 



1 The world of Jinni 

Jinni is based on a simple Things, Places, Agents ontology, borrowed from 
MUDs and MOOs [14,1,3,9,18,15]. 

Things are represented as Prolog terms, basically trees of embedded records 
containing constants and variables to be further instantiated to other trees. 

Places are processes running on various computers with a server component 
listening on a port and a blackboard component allowing synchronized multi- 
user Linda [6,10] and remote predicate call transactions. 

Agents are collections of threads executing a set of goals, possibly spread 
over a set of different Places and usually executing remote and local transactions 
in coordination with other Agents. Asynchronous communication with other 
agents is achieved using the Linda coordination protocol. 

Places and Agents are clonable, support inheritance/sharing of Things and 
are designed to be easily editable/configurable using visual tools. Agent threads 
moving between places and agents moving as units are supported. Places are 
used to abstract away language differences between processors, like for instance 
Jinni and BinProlog. Mobile code allows very high speed processing in Jinni 
by delegating heavy inference processing to high-performance BinProlog compo- 
nents. 

^ Available at http://www.cs.unt.edu/~tarau/netjinni/Jinni.html 



2 Jinni as a Logic Programming Java component 

Jinni is implemented as a lightweight, thin client component, based as much as 
possible on fully portable, platform, version and vendor independent Java code. 
Its main features come from this architectural choice: 

— a trimmed down, simple, operatorless syntactic subset of Prolog, 

— pure Prolog (Horn Clause logic) with leftmost goal unfolding as inference 
rule, 

— multiple asynchronous inference engines each running as a separate thread, 

— a shared blackboard to communicate between engines using a simple Linda- 
style subscribe/publish (in/out in Linda jargon) coordination protocol based 
on associative search, 

— high level networking operations allowing code mobility [2,12,11,4,19,13] and 
remote execution, 

— a straightforward Jinni-to-Java translator allows packaging of Jinni programs 
as Java classes to be transparently loaded over the Internet 

— backtrackable assumptions [17,8] implemented through trailed, overridable 
undo actions 

Jinni’s spartan return to (almost) pure Horn Clause logic does not mean it is 
necessarily a weaker language. Expressiveness of full Prolog (and beyond:-)) is 
easily attained in Jinni by combining multiple engines. Engines give transparent 
access to the underlying Java threads and are used to implement local or remote, 
lazy or eager findall operations, negation as failure, if-then-else, etc. at source 
level. Inference engines running on separate threads can cooperate through ei- 
ther predicate calls or through an easy to use flavor of the Linda coordination 
protocol. Remote or local dynamic database updates make Jinni an extremely 
flexible Agent programming language. Jinni is designed on top of dynamic, fully 
garbage collectible data structures, to take advantage of Java’s automatic mem- 
ory management. 
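
The claim that findall, negation as failure and if-then-else can be built at source level from engines can be illustrated with a minimal sketch. It is written in Python rather than Jinni, and every name in it (member, first_answer, findall, naf, if_then_else) is hypothetical: a goal is modeled as a zero-argument function returning a fresh generator of answers, which is only an assumption standing in for a real engine.

```python
from itertools import islice

def member(xs):
    """A toy nondeterministic goal: yields one answer per element of xs."""
    for x in xs:
        yield x

def first_answer(goal):
    """Start a fresh 'engine' for goal and ask it for a single answer."""
    for answer in goal():
        return True, answer
    return False, None

def findall(goal, limit=None):
    """Eager findall: drain the engine, optionally stopping after `limit` answers."""
    return list(islice(goal(), limit))

def naf(goal):
    """Negation as failure: succeeds exactly when the goal has no answer."""
    found, _ = first_answer(goal)
    return not found

def if_then_else(cond, then, otherwise):
    """Commit to the first answer of cond, in the spirit of (C -> T ; E)."""
    found, answer = first_answer(cond)
    return then(answer) if found else otherwise()

# Example goal: the even members of a list.
evens = lambda: (x for x in member([1, 2, 3, 4]) if x % 2 == 0)
```

With these definitions, findall(evens) collects every answer, naf(evens) fails because an answer exists, and if_then_else(evens, ...) uses only the first answer and discards the rest of the search.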

3 What’s new in Jinni 

3.1 Engines 

An engine can be seen as an abstract data-type which produces a (possibly 
infinite) stream of solutions as needed. To create a new engine, we use: 

new_engine(Goal,Answer,Handle) 

The Handle is a unique Java Object denoting the engine, assigned to its own 
thread, for further processing. 

To get an answer from the engine we use: 

ask_engine(Handle,Answer) 

Each engine backtracks independently using its choice-point stack and trail dur- 
ing the computation of an answer. Once computed, an answer is copied from an 
engine to the master engine which initiated it. 

Multiple engines enhance the expressiveness of Prolog by allowing an AND- 
branch of an engine to collect answers from multiple OR-branches of other en- 
gine(s). We call this design pattern orthogonal engines. They give the pro- 
grammer the means to view the answers produced by an engine as an abstract 
sequence and to control that sequence, in a way similar to Java’s Enumeration 
interface. In fact, by using orthogonal engines, a programmer does not really 
need to use findall and other similar predicates anymore: why accumulate 
answers eagerly on a list which will get scanned and decomposed again, when 
answers can be produced on demand? 
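
The on-demand behavior described above can be sketched in Python with generators. This is not Jinni's implementation: the Engine class, its ask method and the helper names are hypothetical, and returning ("the", Answer) or "no" from ask is an assumption borrowed from the convention the paper states for the/3, not something the paper specifies for ask_engine.

```python
import itertools

class Engine:
    """Hypothetical stand-in for an engine: a goal plus its own private search state."""

    def __init__(self, goal):
        # The generator object plays the role of the engine's choice-point stack.
        self._answers = goal()

    def ask(self):
        """Return ("the", answer) for the next answer, or "no" once exhausted."""
        try:
            return ("the", next(self._answers))
        except StopIteration:
            return "no"

def naturals():
    """An infinite goal: 0, 1, 2, ..."""
    return itertools.count()

def squares_of(engine, n):
    """The caller pulls at most n answers on demand and processes each one."""
    out = []
    for _ in range(n):
        result = engine.ask()
        if result == "no":
            break
        out.append(result[1] ** 2)
    return out
```

Because each Engine keeps its own state, a second call to squares_of on the same engine continues where the first one stopped, while a newly created engine starts again from the first answer. No answer list is ever built for the infinite goal.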



3.2 Coordination and remote execution mechanisms 

Our networking constructs are built on top of the popular Linda [6,10,7] coor- 
dination framework, enhanced with unification based pattern matching, remote 
execution and a set of simple client-server components melted together into a 
scalable peer-to-peer layer, forming a ‘web of interconnected worlds’. The basic 
operations are the following: 

— in(X): waits until it can take an object matching X from the server 

— out(X): puts X on the server and possibly wakes up a waiting in/1 operation 

— all(X,Xs): reads the list Xs of all objects matching X currently on the server 

— the(Pattern,Goal,Answer): remotely runs a thread executing Goal on the 
server and collects its first Answer, of the form the(Pattern) if successful, 
and the atom no otherwise 

The presence of the all/2 collector compensates for the lack of non-deterministic 
operations. Note that the only blocking operation is in/1. Blocking rd/1 is easily 
emulated in terms of in/1 and out/1, while non-blocking rd/1 is emulated with 
all/2. 
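
A minimal single-process sketch of these operations, in Python, may help make the blocking behavior concrete. It is an illustration under stated assumptions, not Jinni's server: the Blackboard class and its method names are hypothetical, terms are plain tuples, and None acts as a wildcard in place of unification with an unbound variable. The name in_ is used because in is a reserved word in Python.

```python
import threading

class Blackboard:
    """Toy Linda server. Patterns are tuples; None is a wildcard (an assumption
    standing in for unification with an unbound variable)."""

    def __init__(self):
        self._terms = []
        self._cond = threading.Condition()

    @staticmethod
    def _matches(pattern, term):
        return len(pattern) == len(term) and all(
            p is None or p == t for p, t in zip(pattern, term))

    def out(self, term):
        """Put a term on the board and wake up any waiting in_ operation."""
        with self._cond:
            self._terms.append(term)
            self._cond.notify_all()

    def in_(self, pattern, timeout=None):
        """Block until a matching term exists, then remove and return it."""
        with self._cond:
            while True:
                for term in self._terms:
                    if self._matches(pattern, term):
                        self._terms.remove(term)
                        return term
                if not self._cond.wait(timeout):
                    raise TimeoutError(pattern)

    def all(self, pattern):
        """Non-blocking: list every term currently matching the pattern."""
        with self._cond:
            return [t for t in self._terms if self._matches(pattern, t)]

    def rd_blocking(self, pattern):
        """Blocking read, emulated as in_ followed by out."""
        term = self.in_(pattern)
        self.out(term)
        return term

    def rd_nonblocking(self, pattern):
        """Non-blocking read, emulated with all."""
        found = self.all(pattern)
        return found[0] if found else None
```

In this sketch the emulated blocking read is not atomic: between the in_ and the out, another client briefly cannot see the term. Whether that matters depends on the application.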



3.3 Server-side constraint solving 

A natural extension to Linda is to use constraint solving for selecting matching 
terms, instead of plain unification. This is implemented in Jinni through the use 
of 2 builtins: 

wait_for(Term,Constraint): waits for a Term such that Constraint is true on 
the server, and when this happens, it removes the result of the match from the 
server with an in/1 operation. Constraint is either a single goal or a list of goals 
[G1,G2,..,Gn] to be executed on the server. 

notify_about(Term): notifies the server to give this term to any blocked client 
which waits for it with a matching constraint, i.e. 

notify_about(stock_offer(nscp,29)) 

would trigger execution of a client having issued 

wait_for(stock_offer(nscp,Price),less(Price,30)). 
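
The stock offer example can be replayed in a small Python sketch. Several things here are assumptions rather than facts about Jinni: the ConstraintBoard class is hypothetical, a constraint is a Python predicate over the term's arguments instead of a goal run on the server, a blocked client is modeled by a callback instead of a suspended thread, and a notification that no client accepts is simply stored.

```python
class ConstraintBoard:
    """Toy server: terms are (functor, args) pairs; a constraint is a predicate
    over the args (a stand-in for goals executed on the server)."""

    def __init__(self):
        self._waiting = []   # (functor, constraint, callback) per blocked client
        self._terms = []     # terms that no waiting client has accepted yet

    def wait_for(self, functor, constraint, callback):
        """Register interest; fire at once if a stored term already qualifies."""
        for term in self._terms:
            if term[0] == functor and constraint(*term[1]):
                self._terms.remove(term)
                callback(term)
                return
        self._waiting.append((functor, constraint, callback))

    def notify_about(self, term):
        """Hand the term to the first waiting client whose constraint holds."""
        functor, args = term
        for entry in self._waiting:
            if entry[0] == functor and entry[1](*args):
                self._waiting.remove(entry)
                entry[2](term)
                return
        self._terms.append(term)   # assumption: unclaimed terms are kept

board = ConstraintBoard()
received = []
# Client: wait for an nscp offer priced below 30.
board.wait_for("stock_offer",
               lambda symbol, price: symbol == "nscp" and price < 30,
               received.append)
board.notify_about(("stock_offer", ("nscp", 31)))   # constraint fails, client keeps waiting
board.notify_about(("stock_offer", ("nscp", 29)))   # constraint holds, client is triggered
```

After the two notifications, received holds only the offer at 29, and the offer at 31 remains on the board for some later client.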

3.4 Mobile Code: for expressiveness and for acceleration 

An obvious way to accelerate slow Prolog processing for a Java based system is 
through use of native (C/C++) methods. The simplest way to accelerate Jinni’s 
Prolog processing is by including BinProlog through Java’s JNI. 

However, a more general scenario, also usable for applets not allowing native 
method invocations, is the use of a remote accelerator. This is achieved transparently 
through the use of mobile code. 

The Oz 2.0 distributed programming proposal of [19] makes object mobility 
more transparent, although the mobile entity is still the state of the objects, not 
“live” code. 

Mobility of “live code” is called computation mobility [5]. It requires inter- 
rupting execution, moving the state of a runtime system (stacks, for instance) 
from one site to another and then resuming execution. Clearly, for some lan- 
guages, this can be hard or completely impossible to achieve. 

Telescript and General Magic’s new Odyssey [11] agent programming frame- 
work, IBM’s Java based aglets [12] as well as Luca Cardelli’s Obliq [2] have 
pioneered implementation technologies achieving computation mobility. 

In the case of Jinni, computation mobility is used both as an accelerator and 
an expressiveness lifting device. A live thread will migrate from Jinni to a faster 
remote BinProlog engine, do some CPU intensive work and then come back with 
the results (or just send back the results, using Linda coordination). A very simple 
way to ensure atomicity and security of complex networked transactions is to 
have the agent code move to the site of the computation, follow existing security 
rules, access possibly large databases and come back with the results. 

Jinni’s mobile computation is based on the use of first order continuations i.e. 
encapsulated future computations, which can be easily suspended, moved over 
the network, and resumed at a different site. As continuations are first-order ob- 
jects both in Jinni and BinProlog, the implementation is straightforward [16] and 
the two engines can interoperate transparently by simply moving computations 
from one to the other. 

Note that a single move/0 operation is used to transport computation to 
the server. The client simply waits until the computation completes, at which 
point bindings for the first solution are propagated back. 

Note that mobile computation is more expressive and more efficient than 
remote predicate calls as such. Basically, it moves once, and executes on the 
server all future computations of the current AND branch. 
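
The effect of a move can be sketched in Python by treating the continuation as an explicit list of remaining goals plus the current bindings. This is only an approximation under stated assumptions: Jinni captures the continuation from the running engine, whereas here it is written out by hand, and the goal table OPS, the functions run and server, and the JSON encoding are all hypothetical.

```python
import json

# Hypothetical goal table available on both sites; names are illustrative only.
OPS = {
    "load":   lambda env, n: {**env, "x": n},
    "square": lambda env, _: {**env, "x": env["x"] ** 2},
    "inc":    lambda env, k: {**env, "x": env["x"] + k},
}

def run(goals, env, site, trace, transport):
    """Execute goals left to right; on 'move', ship the rest of the conjunction."""
    for i, (op, arg) in enumerate(goals):
        if op == "move":
            # The continuation: the remaining goals plus the current bindings.
            packet = json.dumps({"goals": goals[i + 1:], "env": env})
            return transport(packet, trace)
        trace.append((site, op))
        env = OPS[op](env, arg)
    return env

def server(packet, trace):
    """Remote site: resume the received continuation and return the final bindings."""
    state = json.loads(packet)
    return run(state["goals"], state["env"], "server", trace, server)

trace = []
program = [("load", 7), ("move", None), ("square", None), ("inc", 1)]
result = run(program, {}, "client", trace, server)
```

The trace shows the point made above: the computation crosses the network once, and every goal after the move runs on the server, with only the final bindings returned to the client.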
4 Application domains 

Jinni’s client and server scripting abilities are intended to support platform inde- 
pendent Prolog-to-Java and Prolog-to-Prolog bidirectional connection over the 
net and to accelerate integration of the effective inference technologies devel- 
oped over the last 20 years in the field of Logic Programming into mainstream 
Internet products. 

The next iteration is likely to bring a simple, plain English scripting language 
to be compiled to Jinni with speech recognizer/synthesizer based I/O. 

Among the potential targets for Jinni based products are lightweight rule based 
programs assisting customers of Java-enabled appliances, from Web based TVs 
to mobile cell phones and car computers, all requiring knowledge components to 
adjust to increasingly sophisticated user expectations. 

A stock market simulator featuring user programmable intelligent agents is 
currently being implemented on top of Jinni. It is planned to be 
connected to real world Internet based stock trade services. 

5 Conclusion 

The Jinni project shows that Logic Programming languages are well suited as 
the basic glue so much needed for elegant and cost efficient Internet program- 
ming. The ability to compress so much functionality in such a tiny package shows 
that building logic programming components to be integrated in emerging tools 
like Java might be the most practical way towards mainstream recognition and 
widespread use of Logic Programming technology. Jinni’s emphasis on function- 
ality and expressiveness over performance, as well as its use of integrated multi- 
threading and networking, hint towards the priorities we consider important for 
future Logic Programming language design. 

Acknowledgments 

We are grateful for support from NSERC (grant OGP0107411), the Université de 
Moncton, Louisiana Tech University as well as from E-COM Inc. and the Ra- 
diance Group Inc. Special thanks go to Veronica Dahl, Bart Demoen, Koen 
De Bosschere, Ed Freeman, Don Garrett, Stephen Rochefort and Yu Zhang for 
fruitful interaction related to the design, implementation and testing of Jinni. 





Paul Tarau 



References 

1. The Avalon MUD. http://www.avalon-rpg.com/. 

2. K. A. Bharat and L. Cardelli. Migratory applications. In Proceedings of the 
8th Annual ACM Symposium on User Interface Software and Technology, Nov. 
1995. http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-138.html. 

3. BlackSun. CyberGate. http://www.blaxxsun.com/. 

4. L. Cardelli. Mobile ambients. Technical report, Digital, 1997. 
http://www.research.digital.com/SRC/personal/Luca_Cardelli/Papers.html. 

5. L. Cardelli. Mobile Computation. In J. Vitek and C. Tschudin, editors, Mobile 
Object Systems - Towards the Programmable Internet, pages 3-6. Springer-Verlag, 
LNCS 1228, 1997. 

6. N. Carriero and D. Gelernter. Linda in context. CACM, 32(4):444-458, 1989. 

7. S. Castellani and P. Ciancarini. Enhancing Coordination and Modularity Mecha- 
nisms for a Language with Objects-as-Multisets. In P. Ciancarini and C. Hankin, 
editors, Proc. 1st Int. Conf. on Coordination Models and Languages, volume 1061 
of LNCS, pages 89-106, Cesena, Italy, April 1996. Springer. 

8. V. Dahl, P. Tarau, and R. Li. Assumption Grammars for Processing Natural Lan- 
guage. In L. Naish, editor, Proceedings of the Fourteenth International Conference 
on Logic Programming, pages 256-270, MIT Press, 1997. 

9. K. De Bosschere, D. Perron, and P. Tarau. LogiMOO: Prolog Technology for 
Virtual Worlds. In Proceedings of PAP’96, pages 51-64, London, Apr. 1996. 

10. K. De Bosschere and P. Tarau. Blackboard-based Extensions in Prolog. Software 
— Practice and Experience, 26(1):49-69, Jan. 1996. 

11. General Magic Inc. Odyssey. 1997. Available at http://www.genmagic.com/agents. 

12. IBM. Aglets. http://www.trl.ibm.co.jp/aglets. 

13. E. Jul, H. Levy, N. Hutchinson, and A. Black. Fine-Grained Mobility in the Emer- 
ald System. ACM Transactions on Computer Systems, 6(1):109-133, February 
1988. 

14. T. Meyer, D. Blair, and S. Hader. WAXweb: a MOO-based collaborative hyper- 
media system for WWW. Computer Networks and ISDN Systems, 28(1/2):77-84, 
1995. 

15. P. Tarau. Logic Programming and Virtual Worlds. In Proceedings of INAP96, 
Tokyo, Nov. 1996. 

16. P. Tarau and V. Dahl. Mobile Threads through First Order Continuations. 1997. 
Submitted. http://clement.info.umoncton.ca/html/tmob/html.html. 

17. P. Tarau, V. Dahl, and A. Fall. Backtrackable State with Linear Affine Implication 
and Assumption Grammars. In J. Jaffar and R. H. Yap, editors, Concurrency and 
Parallelism, Programming, Networking, and Security, Lecture Notes in Computer 
Science 1179, pages 53-64, Singapore, Dec. 1996. Springer. 

18. P. Tarau and K. De Bosschere. Virtual World Brokerage with BinProlog and 
Netscape. In P. Tarau, A. Davison, K. De Bosschere, and M. Hermenegildo, editors, 
Proceedings of the 1st Workshop on Logic Programming Tools for INTERNET Ap- 
plications, JICSLP’96, Bonn, Sept. 1996. http://clement.info.umoncton.ca/~lpnet. 

19. P. Van Roy, S. Haridi, and P. Brand. Using mobility to make transparent distri- 
bution practical. 1997. Manuscript. 




